
I am running a t-test between two numbers (0.85 and 0.18) and getting a p-value of NaN, along with the following warnings:

_, p_value = stats.ttest_ind(a=Max, b=Max_1, equal_var=False)

C:\Users\NehaBhakat\Anaconda31\lib\site-packages\numpy\core\fromnumeric.py:3584: RuntimeWarning: Degrees of freedom <= 0 for slice
  **kwargs)
C:\Users\NehaBhakat\Anaconda31\lib\site-packages\scipy\stats\_distn_infrastructure.py:903: RuntimeWarning: invalid value encountered in greater
  return (a < x) & (x < b)
C:\Users\NehaBhakat\Anaconda31\lib\site-packages\scipy\stats\_distn_infrastructure.py:903: RuntimeWarning: invalid value encountered in less
  return (a < x) & (x < b)
C:\Users\NehaBhakat\Anaconda31\lib\site-packages\scipy\stats\_distn_infrastructure.py:1912: RuntimeWarning: invalid value encountered in less_equal
  cond2 = cond0 & (x <= _a)

Grayrigel

2 Answers


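A t-test checks whether two samples plausibly come from populations with the same mean. It needs several observations per group to estimate each group's variance, so you cannot run it on two single values. With one value per group the sample variance has zero degrees of freedom and is undefined, so getting NaN is correct.

A sample here means a vector of values that you measured. As a common rule of thumb, you should have at least 30 values per group for a meaningful t-test.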
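A minimal sketch of the difference, assuming a reasonably recent SciPy. The arrays `group_a` and `group_b` are made-up measurements for illustration only, not data from the question:

```python
import warnings

import numpy as np
from scipy import stats

# One observation per group: the variance cannot be estimated (n - 1 = 0),
# so both the statistic and the p-value come back as NaN.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the expected warnings
    single = stats.ttest_ind(a=[0.85], b=[0.18], equal_var=False)

# Hypothetical repeated measurements for each group.
group_a = np.array([0.85, 0.80, 0.91, 0.78, 0.88])
group_b = np.array([0.18, 0.25, 0.15, 0.22, 0.20])
multi = stats.ttest_ind(a=group_a, b=group_b, equal_var=False)

print("single values p-value:", single.pvalue)
print("arrays p-value:", multi.pvalue)
```

The first call should print `nan`; the second returns an ordinary p-value because each group now has enough observations to estimate a variance.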

drops

scipy.stats.ttest_ind() runs a t-test on two samples, testing the null hypothesis that two independent samples have identical average (expected) values. It expects you to pass the two samples (group a, group b) as arrays of all the observations, so it can estimate the standard deviation from them (pooled across both groups by default, or kept separate per group when equal_var=False). See the formula it uses below (from Wikipedia):

[Image: two-sample t-test formula]
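For reference, and assuming the formula meant here is the standard one from the Wikipedia article on Student's t-test, the statistic can be written as below. The first line is the pooled (equal-variance) form, the second is Welch's form, which is what equal_var=False selects. The symbols (sample means, sample variances, sample sizes) are notation introduced here for illustration.

```latex
% Pooled (equal-variance) two-sample t statistic
t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}

% Welch's (unequal-variance) t statistic
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}},
\qquad
s_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} \left( x_{ij} - \bar{X}_i \right)^2
```

With n_1 = n_2 = 1, each sample variance divides by n_i - 1 = 0, so neither form is defined.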

To calculate the standard deviation we need the difference of every point from the mean, which is why the function asks for the whole array of data. The SciPy documentation also explains what the function expects:
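To make the role of the standard deviation concrete, here is a rough sketch that computes Welch's statistic by hand and compares it with SciPy. The helper `welch_t` and the sample data are hypothetical, written for this illustration; they are not part of SciPy or of the question:

```python
import numpy as np
from scipy import stats


def welch_t(a, b):
    """Welch's t statistic computed by hand (illustrative helper, not a SciPy function)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # ddof=1 divides the squared deviations by n - 1. With n = 1 that is 0 / 0,
    # which NumPy reports as "Degrees of freedom <= 0 for slice" and returns NaN.
    var_a = a.var(ddof=1)
    var_b = b.var(ddof=1)
    return (a.mean() - b.mean()) / np.sqrt(var_a / a.size + var_b / b.size)


# Hypothetical repeated measurements for each group.
group_a = [0.85, 0.80, 0.91, 0.78, 0.88]
group_b = [0.18, 0.25, 0.15, 0.22, 0.20]

manual = welch_t(group_a, group_b)
reference = stats.ttest_ind(group_a, group_b, equal_var=False).statistic
print(manual, reference)  # the two values should agree

# One value per group: the variance, and therefore the statistic, is NaN.
single = welch_t([0.85], [0.18])
print(single)
```

The single-value call is expected to emit the same "Degrees of freedom <= 0" warning quoted in the question, since the variance step is where the computation breaks down.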

a, b : array_like
    The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).

tania