I have two sets of noisy samples, and I want to determine whether they are substantively different or not. I plan to do this with a two-sided t-test on their means, looking at the p-value.
Previous answers (e.g. How to calculate the statistics "t-test" with numpy) have recommended using ttest_ind from scipy, i.e. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html, but I don't understand how to interpret its results.
Looking at the examples in the docs, the p-value for the case in which the random samples have the same mean is 0.78849443369564776:
>>> from scipy import stats
>>> rvs1 = stats.norm.rvs(loc=5, scale=10, size=500)
>>> rvs2 = stats.norm.rvs(loc=5, scale=10, size=500)
>>> stats.ttest_ind(rvs1, rvs2)
(0.26833823296239279, 0.78849443369564776)
and the p-value for the case in which the random samples have different means and standard deviations is 0.34744170334794122:
>>> rvs5 = stats.norm.rvs(loc=8, scale=20, size=100)
>>> stats.ttest_ind(rvs1, rvs5)
(-1.4679669854490653, 0.14263895620529152)
>>> stats.ttest_ind(rvs1, rvs5, equal_var=False)
(-0.94365973617132992, 0.34744170334794122)
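For reference, here is a self-contained version of the snippets above as a single script. The seed is my addition (the docs' examples are unseeded, so the exact numbers above won't reproduce), and I'm unpacking the result as a (statistic, p-value) pair:

```python
import numpy as np
from scipy import stats

# Seeded generator so the run is reproducible (seed choice is arbitrary)
rng = np.random.default_rng(0)

# Two samples from the SAME distribution (mean 5, sd 10)
rvs1 = stats.norm.rvs(loc=5, scale=10, size=500, random_state=rng)
rvs2 = stats.norm.rvs(loc=5, scale=10, size=500, random_state=rng)

# A sample from a DIFFERENT distribution (mean 8, sd 20, smaller n)
rvs5 = stats.norm.rvs(loc=8, scale=20, size=100, random_state=rng)

# Standard two-sample t-test (assumes equal variances)
t_same, p_same = stats.ttest_ind(rvs1, rvs2)

# Welch's t-test (does not assume equal variances)
t_diff, p_diff = stats.ttest_ind(rvs1, rvs5, equal_var=False)

print("same distribution:      t =", t_same, " p =", p_same)
print("different distribution: t =", t_diff, " p =", p_diff)
```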
It seems like we never get a p-value below 0.1, and so never reject the null hypothesis, even in the case where the samples are clearly drawn from distributions with different means.
There must be something obvious that I am missing here but after much RTFMing, I can't figure out what it is...