How to select columns of a data frame not in a list of names in Python

Question

I want to subset the column index of a data frame, by keeping all of the columns that do NOT start with a certain string. Right now I can filter the column index for columns that start with a certain string, but I'd like to do the opposite and exclude column names that match a certain criterion.

If trying to keep the columns

If trying to keep column names that start with FR, I use this code, which works.

factor_list = df2.filter(like = 'FR_', axis=1).columns

Trying to exclude columns that start with "FR", I've tried many things including the following:

factor_list = df2.filter(like != 'FR_', axis=1).columns
factor_list = ~df2.filter(like = 'FR_', axis=1).columns

https://stackoverflow.com/questions/19071199/drop-columns-whose-name-contains-a-specific-string-from-pandas-dataframe — Vishwas, Jul 09 '19 at 21:34

score 0 · Accepted Answer · answered Jul 09 '19 at 21:35

You were quite close with df.filter. In this case it's easier to use a regular expression by filtering all the columns which don't start (?!^) with FR_:

df.filter(regex='^(?!FR_)')

   Col1  Col3
0     2     8
1     0     7
2     4     1
3     2     6
4     4     8

# Example dataframe
df = pd.DataFrame({'Col1':np.random.randint(0,10,5),
                   'FR_Col2':np.random.randint(0,10,5),
                   'Col3':np.random.randint(0,10,5),
                   'FR_Col4':np.random.randint(0,10,5)})

   Col1  FR_Col2  Col3  FR_Col4
0     2        1     8        5
1     0        7     7        9
2     4        8     1        9
3     2        2     6        9
4     4        9     8        3

Thank you for the additional context and explanation. – WON_Eric Jul 10 '19 at 12:45 — WON_Eric, Jul 10 '19 at 12:45

score 0 · Answer 2 · answered Jul 09 '19 at 21:36

0

This should work for you. Please check the above link for more details.

factor_list = df2.loc[:,~df.columns.str.startswith('FR')]

answered Jul 09 '19 at 21:36

Vishwas

343
2
13

score 0 · Answer 3 · answered Jul 09 '19 at 21:37

0

You can do

factor_list = df2.filter(regex = '^(?!FR_).*', axis=1).columns

answered Jul 09 '19 at 21:37

Akaisteph7

5,034
2
20
43

This works because I can just add the code directly into my program. – WON_Eric Jul 10 '19 at 12:44

How to select columns of a data frame not in a list of names in Python

3 Answers3