
I am checking whether Set A is better than Set B. Excel says it is, at the 95% confidence level.

In Excel I am using =T.TEST(Set A, Set B, 1, 1), which gives a p-value of 0.019914.

In Python, the code below gives p = 0.90, i.e. not significant.

Which is correct? I get different answers from the Python code and the Excel formula, and any data I try yields different p-values. What am I doing wrong here?

from scipy import stats

# Set A (variable names cannot contain spaces, so use set_A / set_B)
set_A = [0.9988, 1.0000, 0.3213, 0.4863, 0.6606, 0.0409, 0.9996, 0.8822, 1.0000, 0.9374, 0.9999, 0.0080, 0.9983, 0.0000, 0.0000, 0.0000, 0.5000, 0.0602, 0.9999, 1.0000, 1.0000, 1.0000, 1.0000, 0.0000, 0.9985, 0.8261, 0.2058, 0.3295, 0.6029, 0.0504]
# Set B
set_B = [0.9995, 1.0000, 1.0000, 1.0000, 1.0000, 0.0000, 0.9993, 0.9381, 0.6929, 0.7971, 0.8464, 0.0220, 0.9979, 0.8584, 0.7538, 0.8027, 0.8768, 0.0231, 0.9990, 0.8611, 0.6294, 0.7273, 0.8146, 0.0294, 0.9992, 0.8466, 0.7284, 0.7831, 0.8641, 0.0252]

# Calculate the t-statistic and p-value
t_stat, p_value = stats.ttest_ind(set_A, set_B, alternative='greater')

# Print results
print("T-statistic:", t_stat)
print("P-value:", p_value)
Zac Hatfield-Dodds
b t

1 Answer


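In Excel, the last argument of T.TEST selects the test type: 1 is a paired test, 2 is a two-sample test with equal variance, and 3 is a two-sample test with unequal variance. So =T.TEST(Set A, Set B, 1, 1) performs a paired t-test. If you instead want a t-test for independent samples with equal variance, which is what stats.ttest_ind does by default, use =T.TEST(Set A, Set B, 1, 2).

In scipy there are two different functions for paired and independent t-tests: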

from scipy import stats

set_A = [0.9988, 1.0000, 0.3213, 0.4863, 0.6606, 0.0409, 0.9996, 0.8822, 1.0000, 0.9374,0.9999, 0.0080, 0.9983, 0.0000, 0.0000, 0.0000, 0.5000, 0.0602, 0.9999, 1.0000, 1.0000, 1.0000, 1.0000, 0.0000, 0.9985, 0.8261, 0.2058, 0.3295, 0.6029, 0.0504]

set_B = [0.9995, 1.0000, 1.0000, 1.0000, 1.0000, 0.0000, 0.9993, 0.9381, 0.6929, 0.7971,0.8464, 0.0220, 0.9979, 0.8584, 0.7538, 0.8027, 0.8768, 0.0231, 0.9990, 0.8611,0.6294, 0.7273, 0.8146, 0.0294, 0.9992, 0.8466, 0.7284, 0.7831, 0.8641, 0.0252]


t_stat_ind, p_value_ind = stats.ttest_ind(set_A, set_B, alternative='greater')
t_stat_paired, p_value_paired = stats.ttest_rel(set_A, set_B, alternative='greater')

print(p_value_ind)
print(p_value_paired)

# Outputs
# 0.9086667470962503
# 0.9800819602785336
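
The paired p-value above (about 0.98) and the Excel figure (about 0.02) are the two opposite tails of the same test. As far as I understand Excel's documentation, T.TEST with tails=1 returns half of the two-tailed value, so it always reports the smaller tail, whereas scipy reports the tail you request with `alternative`. The following is a minimal sketch under that assumption, reusing the same data; the variable names `p_greater`, `p_less`, and `p_two_sided` are only for illustration.

```python
from scipy import stats

set_A = [0.9988, 1.0000, 0.3213, 0.4863, 0.6606, 0.0409, 0.9996, 0.8822, 1.0000, 0.9374,
         0.9999, 0.0080, 0.9983, 0.0000, 0.0000, 0.0000, 0.5000, 0.0602, 0.9999, 1.0000,
         1.0000, 1.0000, 1.0000, 0.0000, 0.9985, 0.8261, 0.2058, 0.3295, 0.6029, 0.0504]
set_B = [0.9995, 1.0000, 1.0000, 1.0000, 1.0000, 0.0000, 0.9993, 0.9381, 0.6929, 0.7971,
         0.8464, 0.0220, 0.9979, 0.8584, 0.7538, 0.8027, 0.8768, 0.0231, 0.9990, 0.8611,
         0.6294, 0.7273, 0.8146, 0.0294, 0.9992, 0.8466, 0.7284, 0.7831, 0.8641, 0.0252]

# Paired t-test with each possible alternative hypothesis.
# 'greater' tests mean(A - B) > 0, 'less' tests mean(A - B) < 0.
p_greater = stats.ttest_rel(set_A, set_B, alternative='greater').pvalue
p_less = stats.ttest_rel(set_A, set_B, alternative='less').pvalue
p_two_sided = stats.ttest_rel(set_A, set_B, alternative='two-sided').pvalue

print("greater:      ", p_greater)
print("less:         ", p_less)
# Assumption: this halved two-sided value is what Excel's tails=1 reports.
print("two-sided / 2:", p_two_sided / 2)
```

The 'less' p-value is 1 minus the 'greater' one, i.e. roughly 0.02, close to the Excel result. In other words, the small p-value is evidence that Set A is lower than Set B on average, not higher, so the 'greater' test in Python is right to report no significance.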
Ignatius Reilly
  • Thank you. Do I interpret p_value_paired as meaning there is less than a 2% (100-98%) chance that the null hypothesis can be rejected? – b t Jul 03 '23 at 03:19
  • I see. Less than a 2% chance that the null is true? Why does Python do it the other way compared with Excel? Is that always the case? When you do it in Excel you get 0.019... – b t Jul 05 '23 at 23:36