t test getting nan output for both

Question

Every t-test I run outputs nan for both the statistic and the p-value. I have checked my dataframes and they look fine. Does anyone know what's happening? Thanks in advance!

e_tr.groupby('Rest Periods')['Wages and Hours'].mean()

#t test
cat1 = e_tr[e_tr['Rest Periods']==0]
cat2 = e_tr[e_tr['Rest Periods']==1]
# cat1['Wages and Hours'].value_counts()
sp.stats.ttest_ind(cat1.dropna()['Rest Periods'], cat2.dropna()['Rest Periods'])
ttest_ind(cat1['Wages and Hours'], cat2['Wages and Hours'])

Output: Ttest_indResult(statistic=nan, pvalue=nan)

The question is not fully transparent, see [here](https://stackoverflow.com/help/minimal-reproducible-example) for more information. E.g. what is `e_tr` here and what libraries you're using? It may only be obvious for you. — colidyre, Apr 08 '20 at 10:24

tianlinhe · Answer 1 · 2020-04-08T13:11:33.633

That is possibly because the 'Wages and Hours' column in your test and control groups still contains np.nan. Try cleaning your data first:

e_tr=e_tr[e_tr['Wages and Hours'].notnull()]

Then assign case and control as you did:

cat1 = e_tr[e_tr['Rest Periods']==0]
cat2 = e_tr[e_tr['Rest Periods']==1]

So now if you run:

ttest_ind(cat1['Wages and Hours'], cat2['Wages and Hours'])

it should give you the anticipated statistics.
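To see the effect in isolation, here is a minimal sketch. The small dataframe is a made-up stand-in for `e_tr` (the real data isn't shown in the question), and it assumes `ttest_ind` comes from `scipy.stats`. It shows a single missing value turning the result into nan, and two ways around it: dropping the missing values from the tested column, or passing `nan_policy='omit'` to `ttest_ind`.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind

# Hypothetical stand-in for e_tr, with one missing value in 'Wages and Hours'
e_tr = pd.DataFrame({
    'Rest Periods':    [0, 0, 0, 0, 1, 1, 1, 1],
    'Wages and Hours': [1.0, 2.0, np.nan, 4.0, 3.0, 5.0, 6.0, 8.0],
})

cat1 = e_tr[e_tr['Rest Periods'] == 0]
cat2 = e_tr[e_tr['Rest Periods'] == 1]

# With the default nan_policy='propagate', one NaN makes the whole result nan
raw = ttest_ind(cat1['Wages and Hours'], cat2['Wages and Hours'])

# Option 1: drop missing values from the column that is actually tested
cleaned = ttest_ind(cat1['Wages and Hours'].dropna(),
                    cat2['Wages and Hours'].dropna())

# Option 2: ask SciPy to ignore NaNs
omitted = ttest_ind(cat1['Wages and Hours'], cat2['Wages and Hours'],
                    nan_policy='omit')

print(raw)
print(cleaned)
print(omitted)
```

Both options should give the same statistic here. Note that `dropna()` is applied to the tested column only, so rows are not discarded because of missing values in unrelated columns.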
