I'm trying to run a t-test to check whether there is a significant difference between two samples for a given KPI. I'm running this Python code:
import scipy.stats

# Summary statistics for the two groups
population_control = 18917
population_treatment = 169996
stddev_control = 3.7944261452602888
stddev_treatment = 3.8521668798017057
avg_control = 2.906
avg_treatment = 2.921

# Simulate normal samples with these parameters, then run Welch's t-test
rvs1 = scipy.stats.norm.rvs(loc=avg_control, scale=stddev_control, size=population_control)
rvs2 = scipy.stats.norm.rvs(loc=avg_treatment, scale=stddev_treatment, size=population_treatment)
t_score, pvalue = scipy.stats.ttest_ind(rvs1, rvs2, equal_var=False)
print(pvalue)
But I don't understand why the output changes from one execution to the next, for the same input. Sometimes I get a p-value < 0.05 (significant), sometimes it's much higher.
Also, when I set np.random.seed(12345678) first, I always get the same p-value, but that makes me doubt what I'm doing.
Do you have any idea? Thanks a lot.
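For what it's worth, since I only have summary statistics anyway, I tried skipping the simulation entirely: scipy.stats.ttest_ind_from_stats seems to compute the same Welch test directly from the means, standard deviations, and sample sizes, with no random sampling involved (just a sketch, I'm not sure this is the right approach):

```python
import scipy.stats

# Same summary statistics as above
avg_control, stddev_control, population_control = 2.906, 3.7944261452602888, 18917
avg_treatment, stddev_treatment, population_treatment = 2.921, 3.8521668798017057, 169996

# Welch's t-test computed directly from the summary statistics:
# no random draws, so the result is identical on every run
t_score, pvalue = scipy.stats.ttest_ind_from_stats(
    avg_control, stddev_control, population_control,
    avg_treatment, stddev_treatment, population_treatment,
    equal_var=False,
)
print(t_score, pvalue)
```

With this version I don't need a seed at all, because nothing is being simulated.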