import numpy as np
import pandas as pd
from scipy.stats import ttest_ind

df = pd.DataFrame(np.random.randint(0,100,size=(15, 3)), columns=list('NMO'))
df['Category1'] = ['I','I','I','I','I','G','G','G','G','G','P','P','I','I','P']
df['Category2'] = ['W','W','C','C','C','W','W','W','W','W','O','O','O','O','O']
If I wanted to do a t-test on this data, based on both categories, how would I refer to the categories?
If I was doing the test on one category it would look like:
ttest_ind(
df[df['Category1']=='P']['N'],
df[df['Category1']=='I']['N'])
But what if I wanted to compare the N values of rows that match on both categories, for example G and W against I and W? I tried this, but it doesn't work:
ttest_ind(
df[[df['Category1']=='G'] and [df['Category2']=='W']]['N'],
df[[df['Category1']=='I'] and [df['Category2']=='W']]['N'])
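For reference, here is a minimal sketch of one way the two-category selection could be written. It assumes the intent is "rows where both conditions hold". In the attempt above, `[a] and [b]` is Python's `and` applied to two non-empty lists, which evaluates to just the second list, so the two conditions are never combined. The sketch uses the element-wise `&` operator on the boolean Series instead, with each comparison in its own parentheses. The seeded generator and the names `g_and_w`, `i_and_w`, `g_w`, and `i_w` are my own additions for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind

# Same layout as the example data; the seed is an assumption added for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(15, 3)), columns=list('NMO'))
df['Category1'] = ['I','I','I','I','I','G','G','G','G','G','P','P','I','I','P']
df['Category2'] = ['W','W','C','C','C','W','W','W','W','W','O','O','O','O','O']

# One boolean mask per group. & is element-wise AND; the parentheses are
# required because & binds more tightly than ==.
g_and_w = (df['Category1'] == 'G') & (df['Category2'] == 'W')
i_and_w = (df['Category1'] == 'I') & (df['Category2'] == 'W')

# Select column N for the rows matching each mask.
g_w = df.loc[g_and_w, 'N']
i_w = df.loc[i_and_w, 'N']

result = ttest_ind(g_w, i_w)
print(result.statistic, result.pvalue)
```

With this particular data the I-and-W group has only two rows, so the test result would not be very informative; the point is only the row selection.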