Typical setup of columns having categorical values as following:
df = pd.DataFrame([('A', 'buy', 'sub'),
('B', 'sell', 'prior'),
('C', 'hold', 'sub'),
('D', 'loan', 'none'),
('A', 'hold', 'sub'),
('A', 'buy', 'none')], columns=['name', 'action', 'class'])
name action class
1 A buy sub
2 B sell prior
3 C hold sub
4 D loan none
5 A hold sub
6 A buy none
Suppose I want to select rows with multiple conditions, from this thread, I can do this:
#selecting rows with name='A' AND action='buy'
df1 = df[(df['name']=='A') & (df['action']=='buy')]
#selecting rows with name='A' OR class='sub'
df2 = df[(df['name']=='A') | (df['class']=='sub')]
For a dataframe with more columns and more selecting conditions, that operation above gets to look ugly (i.e. long statements of &
and |
). I am looking to do something more compact, like this:
df1 = df[df[['name', 'action']] == ('A', 'buy')]
df2 = df[df[['name', 'class']] *** ('A', 'sub')] *** is hybrid operator, maybe?
Any lazy ways to do this? If no such a way, feel free to suggest to close this question. Thanks.