
I have two samples with different numbers of trials and different numbers of successes in each. I'm trying to compare the success rates between the two samples to see if there is a significant difference. I get very different results depending on whether I use binom_test from scipy.stats or the function below, which assumes the test statistic is normally distributed.

Can someone please tell me whether I'm applying binom_test incorrectly, or whether there's an error in (or I'm misusing) the function below?

I got the function from a Stack Exchange post; it seems like hatP might be incorrect.

I have sample data and the results from binom_test and the function below. binom_test returns a p-value that essentially rounds down to 0, while the function returns a p-value of 1.82, which doesn't even make sense, since a p-value must lie between 0 and 1.

The function is from this Stack Exchange post: https://stats.stackexchange.com/questions/81091/is-it-possible-to-do-a-test-of-significance-for-a-string-occurrence-in-two-datas

# 2 sample binom

import math

import scipy.stats as stats


def fnDiffProp(x1, x2, n1, n2):
    '''
    inputs:
    x1: the number of successes in the first sample
    x2: the number of successes in the second sample
    n1: the total number of 'trials' in the first sample
    n2: the total number of 'trials' in the second sample
    output:
    the test statistic and the p-value, as a tuple
    '''
    hatP = (x1 + x2) / (n1 + n2)
    hatQ = 1 - hatP
    hatP1 = x1 / n1
    hatP2 = x1 / n2
    Z = (hatP1 - hatP2) / math.sqrt(hatP * hatQ * (1 / n1 + 1 / n2))
    pVal = 2 * (1 - stats.norm.cdf(Z))
    return (Z, pVal)



sample 1

195 successes
135779 trials

sample 2

5481 successes
81530 trials
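As a sanity check on the data above, the raw success rates can be computed directly (a quick sketch in plain Python, with the counts copied from the two samples):

```python
# Success rates for the two samples listed above
x1, n1 = 195, 135779   # sample 1
x2, n2 = 5481, 81530   # sample 2

p1 = x1 / n1  # sample 1 success rate
p2 = x2 / n2  # sample 2 success rate

print(f"sample 1 rate: {p1:.6f}")  # about 0.001436 (~0.14%)
print(f"sample 2 rate: {p2:.6f}")  # about 0.067227 (~6.72%)
```

The two rates differ by a factor of roughly 45, so a near-zero p-value from any two-sample test is not surprising here.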


results from binom_test

binom_test(x=5481, n=81530, p=0.0014, alternative='greater')
0.0

binom_test(x=5481, n=81530, p=0.0014, alternative='two-sided')
0.0
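For context on why binom_test reports exactly 0.0: under the null p = 0.0014, the observed count of 5481 is hundreds of binomial standard deviations above the expected count, so the exact p-value underflows double precision. A back-of-the-envelope check (plain Python, counts taken from the calls above):

```python
# Why binom_test returns 0.0: compare the observed count with what the
# null hypothesis p = 0.0014 predicts for 81530 trials.
n, p0, observed = 81530, 0.0014, 5481

expected = n * p0                        # mean of the null binomial
sd = (n * p0 * (1 - p0)) ** 0.5         # its standard deviation
z = (observed - expected) / sd          # rough distance from the null

print(f"expected successes: {expected:.1f}")  # about 114.1
print(f"standard deviation: {sd:.1f}")        # about 10.7
print(f"distance from null: {z:.0f} sd")      # hundreds of sd out
```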


fnDiffProp(x1=195, x2=5481, n1=135779, n2=81530)

(-1.3523132192521408, 1.82372486268966)
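The 1.82 seems to trace to two slips in the posted function: hatP2 is computed from x1 rather than x2, and 2*(1 - stats.norm.cdf(Z)) exceeds 1 whenever Z is negative. Below is a minimal corrected sketch (the name fnDiffPropFixed is mine; it uses math.erfc instead of scipy so it is self-contained, relying on the identity erfc(|Z|/sqrt(2)) = 2*(1 - Phi(|Z|))):

```python
import math


def fnDiffPropFixed(x1, x2, n1, n2):
    """Pooled two-proportion z-test (same idea as the posted function)."""
    hatP = (x1 + x2) / (n1 + n2)      # pooled success rate
    hatQ = 1 - hatP
    hatP1 = x1 / n1
    hatP2 = x2 / n2                   # fixed: was x1 / n2 in the original
    se = math.sqrt(hatP * hatQ * (1 / n1 + 1 / n2))
    Z = (hatP1 - hatP2) / se
    # fixed: use |Z| so a negative Z cannot push the p-value above 1
    pVal = math.erfc(abs(Z) / math.sqrt(2))  # == 2 * (1 - Phi(|Z|))
    return Z, pVal


Z, p = fnDiffPropFixed(x1=195, x2=5481, n1=135779, n2=81530)
print(Z, p)  # Z is about -93.1, matching proportions_ztest up to sign
```

With both fixes the statistic agrees (up to sign, from the order of the samples) with the statsmodels result in the update below, and the p-value underflows to 0 just as binom_test's does.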

update:

I ran proportions_ztest from statsmodels and got the results below, similar to the results from binom_test. In one of the tests below I randomly drew equal-sized samples from both groups. In either case the p-value was so small it got rounded to 0.

from statsmodels.stats.proportion import proportions_ztest

number_of_successes = [5481, 195]
total_sample_sizes = [81530, 135779]
# Calculate z-test statistic and p-value
test_stat, p_value = proportions_ztest(number_of_successes, total_sample_sizes, alternative='larger')

print(str(test_stat))
print(str(p_value))

93.10329278601503
0.0


number_of_successes = [5389, 119]
total_sample_sizes = [80000, 80000]
# Calculate z-test statistic and p-value
test_stat, p_value = proportions_ztest(number_of_successes, total_sample_sizes, alternative='larger')

print(str(test_stat))
print(str(p_value))


72.26377467032772
0.0
user3476463
  • You have to compare two binomially distributed values. Not sure why you want the formula for a normal distribution. Check https://stats.stackexchange.com/questions/113602/test-if-two-binomial-distributions-are-statistically-different-from-each-other – Yuri Ginsburg Feb 22 '23 at 01:59
  • @YuriGinsburg Thank you for getting back to me so quickly; I read this post earlier. Isn't the difference that the test statistic in this post is (n1*x1+n2*x2)/(n1+n2), while in the fnDiffProp function I posted it's (x1+x2)/(n1+n2)? Other than that, what would the difference be between the function I posted and what they're suggesting in the URL in your comment? Also, do you know what the difference is between the post and using binom_test? – user3476463 Feb 22 '23 at 02:17
  • @YuriGinsburg Actually, I took a closer look, and I don't think there's a difference, because x1 and x2 in the post are actually the percentages, not the numbers of successes. So once you multiply them by the number of trials, you get the same test statistic in both posts. So is the fnDiffProp function basically doing what you're suggesting? – user3476463 Feb 22 '23 at 02:24
  • `pVal` is probability, so the value `1.82372486268966` looks strange. – Yuri Ginsburg Feb 22 '23 at 04:29
  • @YuriGinsburg Could it be that the two sample sizes are too different? One sample is almost half the size of the other. – user3476463 Feb 22 '23 at 23:08
  • IMHO the problem is that you use the normal distribution with mean `0` and variance `1`. The mean should be `Z`, and the formula for the variance is mentioned in the comments of the post referred to. – Yuri Ginsburg Feb 23 '23 at 01:49
