
I have two sets of noisy samples and I want to determine whether they are substantively different or not. I plan to do this using a two-sided t-test on their means and looking at the p-value.

Previous answers (e.g. "How to calculate the statistics "t-test" with numpy") have recommended using ttest_ind from scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

But I don't understand how to interpret those results.

In the documentation's examples, the p-value for the case in which the random values have the same mean is 0.78849443369564776:

>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500)
>>> rvs2 = stats.norm.rvs(loc=5,scale=10,size=500)
>>> stats.ttest_ind(rvs1,rvs2)
(0.26833823296239279, 0.78849443369564776)

and the p-value for the case in which the random values have different means and standard deviations is 0.34744170334794122 (with equal_var=False):

>>> rvs5 = stats.norm.rvs(loc=8, scale=20, size=100)
>>> stats.ttest_ind(rvs1, rvs5)
(-1.4679669854490653, 0.14263895620529152)
>>> stats.ttest_ind(rvs1, rvs5, equal_var = False)
(-0.94365973617132992, 0.34744170334794122)

It seems like we never get a p-value below 0.1, so we never reject the null hypothesis, even in the case where the samples are clearly drawn from distributions with different means.

There must be something obvious that I am missing here but after much RTFMing, I can't figure out what it is...

Shankari

1 Answer


Your samples rvs1 and rvs5 overlap a lot. Take a look at their histograms:

In [83]: import numpy as np

In [84]: import matplotlib.pyplot as plt

In [85]: from scipy import stats

In [86]: np.random.seed(12345)

In [87]: rvs1 = stats.norm.rvs(loc=5, scale=10, size=500)

In [88]: rvs5 = stats.norm.rvs(loc=8, scale=20, size=100)

Histograms:

In [91]: plt.hist(rvs1, bins=15, color='c', edgecolor='k', alpha=0.5)
Out[91]: 
(array([ 11.,   8.,  23.,  59.,  70.,  80.,  76.,  75.,  47.,  29.,  15.,
          3.,   1.,   2.,   1.]),
 array([-21.4440949 , -17.06280322, -12.68151153,  -8.30021984,
         -3.91892815,   0.46236353,   4.84365522,   9.22494691,
         13.6062386 ,  17.98753028,  22.36882197,  26.75011366,
         31.13140535,  35.51269703,  39.89398872,  44.27528041]),
 <a list of 15 Patch objects>)

In [92]: plt.hist(rvs5, bins=15, color='g', edgecolor='k', alpha=0.5)
Out[92]: 
(array([  1.,   0.,   0.,   2.,   5.,  10.,  15.,  11.,  16.,  18.,   9.,
          4.,   3.,   4.,   2.]),
 array([-50.98686996, -43.98675863, -36.98664729, -29.98653596,
        -22.98642462, -15.98631329,  -8.98620195,  -1.98609062,
          5.01402071,  12.01413205,  19.01424338,  26.01435472,
         33.01446605,  40.01457739,  47.01468872,  54.01480006]),
 <a list of 15 Patch objects>)

[histograms of rvs1 and rvs5, overlapping substantially]

In this case, the p-value is about 0.16:

In [93]: stats.ttest_ind(rvs1, rvs5, equal_var=False)
Out[93]: Ttest_indResult(statistic=-1.4255662967967209, pvalue=0.15678343609588596)
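As a sanity check (not part of the original session), the statistic that ttest_ind reports with equal_var=False is just the Welch t statistic, which is easy to compute by hand; a minimal sketch, using the same seed and parameters as above:

```python
import numpy as np
from scipy import stats

np.random.seed(12345)
rvs1 = stats.norm.rvs(loc=5, scale=10, size=500)
rvs5 = stats.norm.rvs(loc=8, scale=20, size=100)

# Welch t statistic: difference of sample means divided by the
# standard error computed from the (unpooled) sample variances.
m1, m2 = rvs1.mean(), rvs5.mean()
v1, v2 = rvs1.var(ddof=1), rvs5.var(ddof=1)
n1, n2 = len(rvs1), len(rvs5)
t = (m1 - m2) / np.sqrt(v1 / n1 + v2 / n2)

result = stats.ttest_ind(rvs1, rvs5, equal_var=False)
print(t, result.statistic)  # the hand-computed t matches result.statistic
```

This also answers what the `statistic` field means: it is the number of standard errors separating the two sample means, and the p-value is derived from it.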

If you make the scales smaller, or increase the difference between the means of the distributions from which you draw the samples, you'll see that the p-value gets small pretty quickly. For example,

In [110]: np.random.seed(12345)

In [111]: rvsa = stats.norm.rvs(loc=5, scale=4, size=500)

In [112]: rvsb = stats.norm.rvs(loc=8, scale=6.5, size=100)

In [113]: stats.ttest_ind(rvsa, rvsb, equal_var=False)
Out[113]: Ttest_indResult(statistic=-4.6900889904607572, pvalue=7.3811906412170361e-06)

You'll also see lower p-values if you increase the sizes of the samples. For example, here I increased the sizes of rvs1 and rvs5 to 2000 and 1000, respectively, and the p-value is about 4e-6:

In [120]: np.random.seed(12345)

In [121]: rvs1 = stats.norm.rvs(loc=5, scale=10, size=2000)

In [122]: rvs5 = stats.norm.rvs(loc=8, scale=20, size=1000)

In [123]: stats.ttest_ind(rvs1, rvs5, equal_var=False)
Out[123]: Ttest_indResult(statistic=-4.6093457457907219, pvalue=4.4518966751259737e-06)
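To make the sample-size effect concrete, here is a small sketch (variable names are illustrative, not from the session above) that runs the Welch test at several sample sizes with the same distributions:

```python
import numpy as np
from scipy import stats

np.random.seed(12345)

# Same two distributions as before; only the sample size changes.
pvals = {}
for n in [50, 500, 5000]:
    a = stats.norm.rvs(loc=5, scale=10, size=n)
    b = stats.norm.rvs(loc=8, scale=20, size=n)
    pvals[n] = stats.ttest_ind(a, b, equal_var=False).pvalue

print(pvals)  # the p-value shrinks as n grows
```

The true mean difference is fixed at 3, but the standard error shrinks like 1/sqrt(n), so the test becomes more and more confident that the difference is real.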
Warren Weckesser
  • Those sample rvs are directly from the scipy documentation. You would think that they would use example values that would illustrate the operation of the function better, but apparently not. Maybe I will submit a doc PR if I have time... – Shankari Mar 30 '18 at 16:44
  • what does the statistic value in the result mean? For example statistic=-4.6093457457907219 in the last example. – Sergio Polimante May 06 '22 at 14:15