0

I want to use pandas filter to drop columns that contain the string "delta".

Example dataframe:

import pandas as pd
df = pd.DataFrame(dict(x=[1], x_delta=[2]))

I want to drop all the columns containing the string delta. Keep in mind that the dataframe may have many more columns, so the solution has to be general. I'm thinking about using the filter method, but I haven't been able to get the negation right.

Thanks for your help!

This hasn't worked for me:

def not_delta(df):
    """Drop the columns that contain the word delta"""
    return df.filter(regex="(?!delta)")
David Masip
  • Take a look at this post: https://stackoverflow.com/questions/19071199/drop-columns-whose-name-contains-a-specific-string-from-pandas-dataframe – divyashie Sep 27 '22 at 07:07

5 Answers

2

Try this...

df = pd.DataFrame({"delta1": [1], "delta2": [2], "sdf": [3]})

col_drop = [col for col in df.columns if "delta" in col]

df1 = df.drop(col_drop, axis=1)

#Output of df1
   sdf
0    3
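Since the question asks for something general, here is a minimal sketch that wraps the same idea in a reusable function. The name drop_columns_containing and its substring parameter are hypothetical, chosen only for illustration, and the extra columns in the example frame are made up.

```python
import pandas as pd


def drop_columns_containing(df: pd.DataFrame, substring: str) -> pd.DataFrame:
    """Return a new frame without the columns whose name contains substring."""
    # Labels are converted to str so that non-string labels do not raise.
    to_drop = [col for col in df.columns if substring in str(col)]
    # drop() returns a new DataFrame; the input frame is left unchanged.
    return df.drop(columns=to_drop)


df = pd.DataFrame({"x": [1], "x_delta": [2], "delta_y": [3], "z": [4]})
result = drop_columns_containing(df, "delta")
print(list(result.columns))  # ['x', 'z']
```

Because nothing is modified in place, the original df still has all four columns afterwards.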

Hope this Helps...

MarianD
Sachin Kohli
2

Yes, filter should work. You can use the following to remove columns whose names contain 'delta':

df.filter(regex='^((?!delta).)*$', axis=1)
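As a rough illustration of why this pattern works while the one in the question does not: filter(regex=...) keeps a label when the pattern is found anywhere in it (search semantics), so an unanchored negative lookahead matches every label. The sketch below uses a made-up three-column frame, and the shorter pattern at the end is an alternative I am adding, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"x": [1], "x_delta": [2], "delta_y": [3]})

# "(?!delta)" matches the empty string at any position that does not start
# with "delta". Every label has such a position, so nothing is dropped.
unanchored = df.filter(regex="(?!delta)")
print(list(unanchored.columns))  # ['x', 'x_delta', 'delta_y']

# Anchoring with ^ and $ forces the lookahead to hold at every character
# of the label, so any label containing "delta" fails to match.
anchored = df.filter(regex="^((?!delta).)*$", axis=1)
print(list(anchored.columns))  # ['x']

# An equivalent, shorter pattern: one lookahead from the start of the label.
shorter = df.filter(regex="^(?!.*delta)", axis=1)
print(list(shorter.columns))  # ['x']
```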
SomeDude
1

I would do it the following way:

import pandas as pd
df = pd.DataFrame(dict(x=[1], x_delta=[2]))
todrop = [i for i in df.columns if 'delta' in i]
df.drop(columns=todrop,inplace=True)
print(df)

Output:

   x
0  1
Daweo
1
df[  [col   for col in df   if "delta" not in col]  ]

(Extra spaces are only for emphasizing individual parts – a list comprehension and 3 parts in it.)


The explanation:

df itself is an iterable; it iterates over column names: [col for col in df]

Then we add the if "delta" not in col condition into this list comprehension to keep only appropriate columns.
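A small runnable sketch of the two steps described above, using the example frame from the question:

```python
import pandas as pd

df = pd.DataFrame({"x": [1], "x_delta": [2]})

# Iterating over a DataFrame yields its column labels, not its rows.
print(list(df))  # ['x', 'x_delta']

# Keep only the labels that do not contain "delta", then select them.
kept = [col for col in df if "delta" not in col]
print(df[kept])
```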

MarianD
0

You can try something like this:

req_cols = [_col for _col in df.columns if 'delta' not in _col]
df = df[req_cols].copy()
ap14