Filter pandas column by two criteria

Question

import numpy as np
import pandas as pd
from scipy.stats import ttest_ind
df = pd.DataFrame(np.random.randint(0, 100, size=(15, 3)), columns=list('NMO'))
df['Category1'] = ['I','I','I','I','I','G','G','G','G','G','P','P','I','I','P']
df['Category2'] = ['W','W','C','C','C','W','W','W','W','W','O','O','O','O','O']

If I want to run a t-test on this data based on both categories, how do I select rows by both categories at once?

For a single category, the test would look like this:

ttest_ind(
    df[df['Category1']=='P']['N'], 
    df[df['Category1']=='I']['N'])

But what if I want to compare the N values of rows that match on both categories, for example G and W against I and W? I tried this, but it doesn't work:

ttest_ind(
df[[df['Category1']=='G'] and [df['Category2']=='W']]['N'], 
df[[df['Category1']=='I'] and [df['Category2']=='W']]['N'])

Does this answer your question? [Pandas: Filtering multiple conditions](https://stackoverflow.com/questions/48978550/pandas-filtering-multiple-conditions) — Eduardo Motta de Moraes, Jan 12 '23 at 22:21

Joshua Voskamp · Answer 1 · 2023-01-12T22:20:46.500

Change

df[[df['Category1']=='G'] and [df['Category2']=='W']]['N']

to

df[(df['Category1']=='G') & (df['Category2']=='W')]['N']

and similarly for the 'I'/'W'/'N' line.
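Putting the change together with the data from the question, a minimal end-to-end sketch might look like the following. It assumes ttest_ind comes from scipy.stats (the question does not show its import), uses a fixed random seed only for repeatability, and introduces the mask and variable names (g_and_w, i_and_w, n_gw, n_iw) purely for illustration. It also uses df.loc[mask, 'N'], which selects the same rows as df[mask]['N'] in a single indexing step.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind  # assumption: the question's ttest_ind is SciPy's

np.random.seed(0)  # assumption: fixed seed, only so the sketch is repeatable
df = pd.DataFrame(np.random.randint(0, 100, size=(15, 3)), columns=list('NMO'))
df['Category1'] = ['I','I','I','I','I','G','G','G','G','G','P','P','I','I','P']
df['Category2'] = ['W','W','C','C','C','W','W','W','W','W','O','O','O','O','O']

# Build each combined mask once; every comparison needs its own parentheses.
g_and_w = (df['Category1'] == 'G') & (df['Category2'] == 'W')
i_and_w = (df['Category1'] == 'I') & (df['Category2'] == 'W')

# .loc takes the row mask and the column label together.
n_gw = df.loc[g_and_w, 'N']
n_iw = df.loc[i_and_w, 'N']

result = ttest_ind(n_gw, n_iw)
print(len(n_gw), len(n_iw))            # group sizes going into the test
print(result.statistic, result.pvalue)
```

With the category lists from the question, the G/W group has five rows and the I/W group has only two, so the test result should be read with that small sample in mind.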

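Python's and operator evaluates each operand as a whole for "truthiness" and returns one of the operands; it never works element-wise. In your attempt both operands are non-empty lists, which are always truthy, so the whole expression simply evaluates to the second list rather than to a combined mask. The & operator (with numpy/pandas) is element-wise: on boolean Series it behaves like np.logical_and, which computes the element-wise truth value of two boolean vectors (what you want here). Also, put the expressions in parentheses instead of square brackets. The parentheses are required because & binds more tightly than ==. Effectively

# won't work
df[ [boolean series] and [boolean series] ] -> df[ [second boolean series] ]

# vs

# will work
df[ (boolean series) & (boolean series) ] -> df[ 'logical-and'ed boolean series ]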
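The difference can be checked directly on two small boolean Series. This is only an illustrative sketch: the names a, b, wrapped, raised and combined are made up for the example and are not part of the question's data.

```python
import pandas as pd

a = pd.Series([True, True, False])
b = pd.Series([True, False, False])

# Wrapped in lists: `and` sees two non-empty (truthy) lists and
# returns the second one unchanged, so no combining happens at all.
wrapped = [a] and [b]
print(wrapped[0] is b)

# Without the lists, `and` asks for the truth value of a whole Series,
# which pandas refuses to provide.
raised = False
try:
    a and b
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# `&` combines the two Series element by element.
combined = a & b
print(combined.tolist())
```

Only the last form produces a boolean mask with one entry per row, which is what df[...] needs for filtering.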
