I want to select a subset of rows in a pandas DataFrame, based on a particular string column, where the value starts with any of the values in a list.

A small version of this:

import pandas as pd

df = pd.DataFrame({'a': ['aa10', 'aa11', 'bb13', 'cc14']})
valids = ['aa', 'bb']

So in this case I want just those rows where a starts with aa or bb.

Hanshan

1 Answer

You need str.startswith, which accepts a tuple of prefixes and returns a boolean mask:

df.a.str.startswith(tuple(valids))
Out[191]: 
0     True
1     True
2     True
3    False
Name: a, dtype: bool

Then use that mask to filter the original df:

df[df.a.str.startswith(tuple(valids))]
Out[192]: 
      a
0  aa10
1  aa11
2  bb13
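
If the column can contain missing values, the mask may need an explicit fill before it can be used for indexing. Here is a minimal self-contained sketch of the same filter, assuming a `None` entry added to the sample data for illustration (it is not in the original question). It also shows a regex-based variant using `str.match`, which anchors at the start of the string, in case you would rather not pass a tuple.

```python
import re
import pandas as pd

# Sample data from the question, plus one missing value (an assumption,
# added only to show how NaN/None is handled).
df = pd.DataFrame({'a': ['aa10', 'aa11', 'bb13', 'cc14', None]})
valids = ['aa', 'bb']

# na=False makes missing values count as "does not match",
# so the mask is purely boolean and safe to index with.
mask = df['a'].str.startswith(tuple(valids), na=False)
result = df[mask]

# Alternative: build an anchored regex from the prefixes.
# re.escape guards against prefixes containing regex metacharacters.
pattern = '|'.join(re.escape(v) for v in valids)
mask_regex = df['a'].str.match(pattern, na=False)
result_regex = df[mask_regex]

print(result)
```

Both masks should select the same rows here; the tuple form is usually simpler to read, while the regex form extends more easily if you later need patterns rather than literal prefixes.
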
BENY