0

I want to subset the column index of a data frame, by keeping all of the columns that do NOT start with a certain string. Right now I can filter the column index for columns that start with a certain string, but I'd like to do the opposite and exclude column names that match a certain criterion.

If trying to keep the columns

If trying to keep column names that start with FR, I use this code, which works.

factor_list = df2.filter(like = 'FR_', axis=1).columns

Trying to exclude columns that start with "FR", I've tried many things including the following:

factor_list = df2.filter(like != 'FR_', axis=1).columns
factor_list = ~df2.filter(like = 'FR_', axis=1).columns
WON_Eric
  • 11
  • 3

3 Answers3

0

You were quite close with df.filter. In this case it's easier to use a regular expression by filtering all the columns which don't start (?!^) with FR_:

df.filter(regex='^(?!FR_)')

   Col1  Col3
0     2     8
1     0     7
2     4     1
3     2     6
4     4     8
# Example dataframe
df = pd.DataFrame({'Col1':np.random.randint(0,10,5),
                   'FR_Col2':np.random.randint(0,10,5),
                   'Col3':np.random.randint(0,10,5),
                   'FR_Col4':np.random.randint(0,10,5)})

   Col1  FR_Col2  Col3  FR_Col4
0     2        1     8        5
1     0        7     7        9
2     4        8     1        9
3     2        2     6        9
4     4        9     8        3
Erfan
  • 40,971
  • 8
  • 66
  • 78
0

This should work for you. Please check the above link for more details.

factor_list = df2.loc[:,~df.columns.str.startswith('FR')]
Vishwas
  • 343
  • 2
  • 13
0

You can do

factor_list = df2.filter(regex = '^(?!FR_).*', axis=1).columns
Akaisteph7
  • 5,034
  • 2
  • 20
  • 43