I am comparing stats.ttest_ind() with a "manual" computation of the same test, and I get different results.
import numpy as np
import pandas as pd
import scipy.stats as stats
import math
stats.ttest_ind() method:
# generate data
np.random.seed(123)
df = pd.DataFrame({
    'age': np.random.normal(40, 5, 200).round(),
    'sex': np.random.choice(['male', 'female'], 200, p=[0.4, 0.6]),
})
#define groups
men = df.age[df.sex == 'male']
women = df.age[df.sex == 'female']
#run t-test
test_stat, test_p = stats.ttest_ind(men, women)
print(test_stat, test_p)
Out:
-0.9265613940505325 0.355282312357339
Manual method:
#mean
men_mean, women_mean = men.mean(), women.mean()
#standard deviation
men_sd, women_sd = men.std(ddof=1), women.std(ddof=1)
#standard error
men_n, women_n = len(men), len(women)
men_se, women_se = men_sd/math.sqrt(men_n), women_sd/math.sqrt(women_n)
#standard error on the difference between men and women
se_diff = math.sqrt(men_se**2.0 + women_se**2.0)
#t-stat
t_stat = (men_mean - women_mean) / se_diff
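#degrees of freedom (named dof so it does not overwrite the DataFrame df)
dof = men_n + women_n - 2
#critical value (two-sided, to match the two-sided p-value below)
alpha = 0.05
cv = stats.t.ppf(1.0 - alpha / 2.0, dof)
# p-value
p = (1 - stats.t.cdf(abs(t_stat), dof)) * 2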
print(t_stat, p)
Out:
-0.9244538916746341 0.3563753194455255
The two results differ slightly, in both the t statistic and the p-value. Why? Is it because of how stats.ttest_ind() computes the degrees of freedom? Any insight is much appreciated.
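One check that may help narrow this down: stats.ttest_ind() defaults to equal_var=True, which uses a pooled variance for the standard error, whereas the manual method above combines the two per-group standard errors without pooling (the Welch-style form). The sketch below is a minimal comparison under that assumption. It regenerates the same data, and the names data, dof, pooled_var, se_pooled, and se_unpooled are introduced here for illustration only.

```python
import math

import numpy as np
import pandas as pd
import scipy.stats as stats

# same data as in the question (DataFrame named `data` here, an illustrative name)
np.random.seed(123)
data = pd.DataFrame({
    'age': np.random.normal(40, 5, 200).round(),
    'sex': np.random.choice(['male', 'female'], 200, p=[0.4, 0.6]),
})
men = data.age[data.sex == 'male']
women = data.age[data.sex == 'female']

men_n, women_n = len(men), len(women)
men_var, women_var = men.var(ddof=1), women.var(ddof=1)
mean_diff = men.mean() - women.mean()

# pooled variance: each group's variance weighted by its degrees of freedom
dof = men_n + women_n - 2
pooled_var = ((men_n - 1) * men_var + (women_n - 1) * women_var) / dof
se_pooled = math.sqrt(pooled_var * (1.0 / men_n + 1.0 / women_n))
t_pooled = mean_diff / se_pooled
p_pooled = 2 * stats.t.sf(abs(t_pooled), dof)

# unpooled standard error, the same quantity as se_diff in the manual method
se_unpooled = math.sqrt(men_var / men_n + women_var / women_n)
t_unpooled = mean_diff / se_unpooled

scipy_pooled = stats.ttest_ind(men, women)                  # equal_var=True is the default
scipy_welch = stats.ttest_ind(men, women, equal_var=False)  # Welch's t-test

print(t_pooled, p_pooled)                           # manual pooled computation
print(scipy_pooled.statistic, scipy_pooled.pvalue)  # SciPy default
print(t_unpooled, scipy_welch.statistic)            # unpooled t vs. Welch t
```

If the assumption holds, the first two printed lines should agree, and the unpooled t statistic should agree with the equal_var=False result. The Welch p-value can still differ from the manual one, because Welch's test uses a different degrees-of-freedom formula than n1 + n2 - 2.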