How to drop pandas dataframe columns containing special characters

Question

How do I drop pandas dataframe columns that contain special characters such as @ / ] [ } { - _ etc.?

For example I have the following dataframe (called df):

I need to drop the columns Name and Matchkey because they contain special characters. Also, how can I specify a list of special characters that determines which columns are dropped?

For example: I'd like to drop the columns that contain (in any record, in any cell) any of the following special characters:

listOfSpecialCharacters: ¬,`,!,",£,$,£,#,/,\

Can you provide the text version of your dataset so that I can match the answer with the same data? Also, minor detail, but did you want to include `_` as character to blacklist? — mozway, Apr 06 '22 at 08:47
Ah, never mind ! I have sorted it ! I forgot to use .str! thanks !! — Giampaolo Levorato, Apr 06 '22 at 10:21

score 1 · Accepted Answer · answered Apr 06 '22 at 08:29

One option is to use a regex with str.contains and apply, then use boolean indexing to drop the columns:

import re
chars = '¬`!"£$£#/\\'
regex = f'[{"".join(map(re.escape, chars))}]'
# '[¬`!"£\\$£\\#/\\\\]'

df2 = df.loc[:, ~df.apply(lambda c: c.str.contains(regex).any())]

example:

# input
     A    B    C
0  123  12!  123
1  abc  abc  a¬b

# output
     A
0  123
1  abc
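The example above can be reproduced end to end with the following sketch. The frame is rebuilt from the input table shown in the answer; the intermediate name `has_special` is introduced here purely for illustration and is not part of the original answer.

```python
import re
import pandas as pd

# Rebuild the small input frame from the example (all columns hold strings).
df = pd.DataFrame({
    "A": ["123", "abc"],
    "B": ["12!", "abc"],
    "C": ["123", "a¬b"],
})

# Characters to look for; re.escape protects the ones that are
# special inside a regex (here $, # and the backslash).
chars = '¬`!"£$£#/\\'
regex = f'[{"".join(map(re.escape, chars))}]'

# One boolean per column: True if any cell matches the character class.
has_special = df.apply(lambda c: c.str.contains(regex).any())

# Keep only the columns where no cell matched.
df2 = df.loc[:, ~has_special]
print(df2)
```

Splitting the mask out into its own variable makes it easy to inspect which columns would be dropped before actually dropping them.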

mozway


Thanks! I get this error: AttributeError: Can only use .str accessor with string values! Looks like that code applies only to string columns. – Giampaolo Levorato Apr 06 '22 at 10:16
You can either do `c.astype(str).str.contains(regex).any()` or apply it only to the str/object columns using [`select_dtypes`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.select_dtypes.html) – mozway Apr 06 '22 at 10:56
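A minimal sketch of the `astype(str)` variant from the comment above, for a frame that mixes numeric and text columns. The sample data is made up for illustration (the question's original dataframe is not available as text); only the column names Name and Matchkey come from the question.

```python
import re
import pandas as pd

# Hypothetical sample data: one integer column plus three text columns.
df = pd.DataFrame({
    "id": [1, 2],
    "Name": ["a/b", "cd"],
    "City": ["Rome", "Oslo"],
    "Matchkey": ["x#1", "y2"],
})

chars = '¬`!"£$£#/\\'
regex = f'[{"".join(map(re.escape, chars))}]'

# Casting each column to str first avoids
# "AttributeError: Can only use .str accessor with string values"
# on non-string columns such as "id".
has_special = df.apply(lambda c: c.astype(str).str.contains(regex).any())

df2 = df.loc[:, ~has_special]
print(df2)
```

Note that the cast also turns missing values into the text "nan" and numbers into their printed form, so this approach is only safe when none of the listed characters can appear in those representations. Restricting the check to text columns with `select_dtypes`, as the comment suggests, avoids that caveat.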
