
Is there a Python implementation of the E-test for comparing two Poisson means? For binomials, scipy has Fisher's exact test as stats.fisher_exact, and for Gaussians scipy.stats has Welch's t-test via ttest_ind (with equal_var=False). I can't seem to find any Python implementation for comparing two Poisson samples.

For context, look here

For the algorithm, look here

For an R implementation, look here

Keith
  • I have never seen it in Python. The statsmodels GLM-Poisson or Poisson model provides a test based on the asymptotic distribution. I looked recently into C-tests, E-tests and similar for rates and proportions. The E-test for Poisson in the reference looks like the analogous score test. https://github.com/statsmodels/statsmodels/issues/2607 – Josef Nov 28 '15 at 18:38
  • I want to do a one-tailed test on two samples to test whether one sample is from a population with a larger mean, i.e. standard A/B testing based on p-values. I am quite sure that an E-test would be ideal, but for the moment I would be quite happy to find any Poisson test analogous to the tests I list above. Please answer with a code snippet. – Keith Nov 29 '15 at 21:29

2 Answers


Update: these functions are now included in statsmodels: https://www.statsmodels.org/dev/generated/statsmodels.stats.rates.test_poisson_2indep.html,
with an accompanying version for TOST equivalence testing: https://www.statsmodels.org/dev/generated/statsmodels.stats.rates.tost_poisson_2indep.html
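
A minimal usage sketch, assuming a statsmodels version in which stats.rates is available (argument order and attribute names follow the linked documentation; check the signature of your installed version):

from statsmodels.stats.rates import test_poisson_2indep

# Example 1 from Gu et al. (2008), also used in the script below:
# 60 events over exposure 51477.5 versus 30 events over exposure 54308.7
res = test_poisson_2indep(60, 51477.5, 30, 54308.7,
                          method='score', alternative='larger')
print(res.statistic, res.pvalue)  # should be close to 3.417 and 0.000316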

Here is a start. This implements three tests from Gu et al. (2008) based on the asymptotic normal distribution, and now also two conditional tests based on the exact distribution.

The score test works reasonably well if the counts are not too small (say, larger than 10 or 20) and the exposure times are not very unequal. For smaller counts the results can be either somewhat conservative or somewhat liberal, and other methods would be better. The 'sqrt' version did very well in their simulations, but may have slightly less power than the score test in the settings where the score test works well.

'''Test for ratio of Poisson intensities in two independent samples

Author: Josef Perktold
License: BSD-3

destination statsmodels

'''


from __future__ import division
import numpy as np
from scipy import stats


# copied from statsmodels.stats.weightstats
def _zstat_generic2(value, std_diff, alternative):
    '''generic (normal) z-test to save typing

    can be used as ztest based on summary statistics
    '''
    zstat = value / std_diff
    if alternative in ['two-sided', '2-sided', '2s']:
        pvalue = stats.norm.sf(np.abs(zstat))*2
    elif alternative in ['larger', 'l']:
        pvalue = stats.norm.sf(zstat)
    elif alternative in ['smaller', 's']:
        pvalue = stats.norm.cdf(zstat)
    else:
        raise ValueError('invalid alternative')
    return zstat, pvalue


def poisson_twosample(count1, exposure1, count2, exposure2, ratio_null=1,
                      method='score', alternative='two-sided'):
    '''test for ratio of two sample Poisson intensities

    If the two Poisson rates are g1 and g2, then the Null hypothesis is

    H0: g1 / g2 = ratio_null

    against one of the following alternatives

    H1_2-sided: g1 / g2 != ratio_null
    H1_larger: g1 / g2 > ratio_null
    H1_smaller: g1 / g2 < ratio_null

    Parameters
    ----------
    count1 : int
        Number of events in the first sample
    exposure1 : float
        Total exposure (time * subjects) in the first sample
    count2 : int
        Number of events in the second sample
    exposure2 : float
        Total exposure (time * subjects) in the second sample
    ratio_null : float
        Ratio of the two Poisson rates under the null hypothesis. Default is 1.
    method : string
        Method for the test statistic and the p-value. Defaults to `'score'`.
        Current methods are based on Gu et al. 2008.
        Implemented are 'wald', 'score' and 'sqrt', based on the asymptotic
        normal distribution, and the exact conditional test 'exact-cond' and
        its mid-p version 'cond-midp'; see Notes
    alternative : string
        The alternative hypothesis, H1, has to be one of the following

           'two-sided': H1: ratio of rates is not equal to ratio_null (default)
           'larger' :   H1: ratio of rates is larger than ratio_null
           'smaller' :  H1: ratio of rates is smaller than ratio_null

    Returns
    -------
    stat : float
        Test statistic; None is returned for the conditional tests.
    pvalue : float
        p-value of the test.

    # not yet implemented:
    # results : Results instance
    #     The resulting test statistics and p-values are available as attributes.


    Notes
    -----
    'wald': method W1A, Wald test, variance based on separate estimates
    'score': method W2A, score test, variance based on the estimate under the Null
    'wald-log': method W3A (not implemented here)
    'score-log': method W4A (not implemented here)
    'sqrt': method W5A, based on the variance-stabilizing square root transformation
    'exact-cond': exact conditional test based on the binomial distribution
    'cond-midp': mid-p-value of the exact conditional test

    The latter two are only verified for the one-sided examples.

    References
    ----------
    Gu, Ng, Tang, Schucany (2008). Testing the Ratio of Two Poisson Rates.
    Biometrical Journal 50 (2008) 2.

    '''

    # shortcut names
    y1, n1, y2, n2 = count1, exposure1, count2, exposure2
    d = n2 / n1
    r = ratio_null
    r_d = r / d

    if method in ['score']:
        stat = (y1 - y2 * r_d) / np.sqrt((y1 + y2) * r_d)
        dist = 'normal'
    elif method in ['wald']:
        stat = (y1 - y2 * r_d) / np.sqrt(y1 + y2 * r_d**2)
        dist = 'normal'
    elif method in ['sqrt']:
        stat = 2 * (np.sqrt(y1 + 3 / 8.) - np.sqrt((y2 + 3 / 8.) * r_d))
        stat /= np.sqrt(1 + r_d)
        dist = 'normal'
    elif method in ['exact-cond', 'cond-midp']:
        from statsmodels.stats import proportion
        bp = r_d / (1 + r_d)
        y_total = y1 + y2
        stat = None
        pvalue = proportion.binom_test(y1, y_total, prop=bp, alternative=alternative)
        if method in ['cond-midp']:
            # mid-p adjustment: subtract half of the point probability
            # (not in place, in case we still want the plain binomial p-value)
            pvalue = pvalue - 0.5 * stats.binom.pmf(y1, y_total, bp)

        dist = 'binomial'
    else:
        raise ValueError('invalid method')

    if dist == 'normal':
        return _zstat_generic2(stat, 1, alternative)
    else:
        return stat, pvalue


from numpy.testing import assert_allclose

# testing against two examples in Gu et al

print('\ntwo-sided /2')
# example 1
count1, n1, count2, n2 = 60, 51477.5, 30, 54308.7

s1, pv1 = poisson_twosample(count1, n1, count2, n2, method='wald')
pv1r = 0.000356
assert_allclose(pv1, pv1r*2, rtol=0, atol=5e-6)
print('wald', s1, pv1 / 2)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='score')
pv2r = 0.000316
assert_allclose(pv2, pv2r*2, rtol=0, atol=5e-6)
print('score', s2, pv2 / 2)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='sqrt')
pv2r = 0.000285
assert_allclose(pv2, pv2r*2, rtol=0, atol=5e-6)
print('sqrt', s2, pv2 / 2)   # one sided in the "right" direction

print('\ntwo-sided /2')
# example2
# not sure why agreement with the published values is only to about 2.5 decimals; rounding in the paper?
count1, n1, count2, n2 = 41, 28010, 15, 19017
s1, pv1 = poisson_twosample(count1, n1, count2, n2, method='wald', ratio_null=1.5)
pv1r = 0.2309
assert_allclose(pv1, pv1r*2, rtol=0, atol=5e-3)
print('wald', s1, pv1 / 2)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='score', ratio_null=1.5)
pv2r = 0.2398
assert_allclose(pv2, pv2r*2, rtol=0, atol=5e-3)
print('score', s2, pv2 / 2)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='sqrt', ratio_null=1.5)
pv2r = 0.2499
assert_allclose(pv2, pv2r*2, rtol=0, atol=5e-3)
print('sqrt', s2, pv2 / 2)   # one sided in the "right" direction

print('\none-sided')
# example 1 onesided
count1, n1, count2, n2 = 60, 51477.5, 30, 54308.7

s1, pv1 = poisson_twosample(count1, n1, count2, n2, method='wald', alternative='larger')
pv1r = 0.000356
assert_allclose(pv1, pv1r, rtol=0, atol=5e-6)
print('wald', s1, pv1)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='score', alternative='larger')
pv2r = 0.000316
assert_allclose(pv2, pv2r, rtol=0, atol=5e-6)
print('score', s2, pv2)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='sqrt', alternative='larger')
pv2r = 0.000285
assert_allclose(pv2, pv2r, rtol=0, atol=5e-6)
print('sqrt', s2, pv2)   # one sided in the "right" direction

# 'exact-cond', 'cond-midp'

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='exact-cond',
                            ratio_null=1, alternative='larger')
pv2r = 0.000428  # typo in Gu et al, switched pvalues between C and M
assert_allclose(pv2, pv2r, rtol=0, atol=5e-4)
print('exact-cond', s2, pv2)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='cond-midp',
                            ratio_null=1, alternative='larger')
pv2r = 0.000310
assert_allclose(pv2, pv2r, rtol=0, atol=5e-4)
print('cond-midp', s2, pv2)   # one sided in the "right" direction


print('\none-sided')
# example2 onesided
# not sure why agreement with the published values is only to about 2.5 decimals; rounding in the paper?
count1, n1, count2, n2 = 41, 28010, 15, 19017
s1, pv1 = poisson_twosample(count1, n1, count2, n2, method='wald',
                            ratio_null=1.5, alternative='larger')
pv1r = 0.2309
assert_allclose(pv1, pv1r, rtol=0, atol=5e-4)
print('wald', s1, pv1)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='score',
                            ratio_null=1.5, alternative='larger')
pv2r = 0.2398
assert_allclose(pv2, pv2r, rtol=0, atol=5e-4)
print('score', s2, pv2)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='sqrt',
                            ratio_null=1.5, alternative='larger')
pv2r = 0.2499
assert_allclose(pv2, pv2r, rtol=0, atol=5e-4)
print('sqrt', s2, pv2)   # one sided in the "right" direction

# 'exact-cond', 'cond-midp'

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='exact-cond',
                            ratio_null=1.5, alternative='larger')
pv2r = 0.2913
assert_allclose(pv2, pv2r, rtol=0, atol=5e-4)
print('exact-cond', s2, pv2)   # one sided in the "right" direction

s2, pv2 = poisson_twosample(count1, n1, count2, n2, method='cond-midp',
                            ratio_null=1.5, alternative='larger')
pv2r = 0.2450
assert_allclose(pv2, pv2r, rtol=0, atol=5e-4)
print('cond-midp', s2, pv2)   # one sided in the "right" direction

This prints

two-sided /2
wald 3.38491255626 0.000356004664253
score 3.417401839 0.000316109441024
sqrt 3.44548501956 0.00028501778109

two-sided /2
wald 0.73544663636 0.231033764105
score 0.706630933035 0.239897930348
sqrt 0.674401392575 0.250028078819

one-sided
wald 3.38491255626 0.000356004664253
score 3.417401839 0.000316109441024
sqrt 3.44548501956 0.00028501778109

one-sided
wald 0.73544663636 0.231033764105
score 0.706630933035 0.239897930348
sqrt 0.674401392575 0.250028078819

The exact conditional test would be relatively easy to implement but is very conservative and has low power. The approximately exact tests will require a bit more effort (for which I will not have time at the moment).

(As often: the actual calculations are a few lines. Deciding on the interface and adding documentation and unit tests is more work.)

EDIT

The above script now also includes the exact conditional test and its mid-p-value version, checked against the two one-sided examples in Gu et al.

Example 1: one-sided

exact-cond None 0.00042805269405
cond-midp None 0.000310132441983

Example 2: one-sided

exact-cond None 0.291453753765
cond-midp None 0.245173718501

Currently no test statistic is returned for the conditional tests.

Josef
  • My counts are rarely greater than 20 and the exposure times are equal to machine precision. I think I will go with the sqrt method for now. I would prefer the exact method because I want to be conservative. This will be a good starting point in any case. – Keith Dec 01 '15 at 22:38
  • I don't understand what it means to have exposure time equal to machine precision. The Poisson rate will be relative to whatever the unit of exposure is. Working at machine precision could blow up numerical noise and it might be better to change the exposure unit. – Josef Dec 02 '15 at 00:29
  • Conditional exact tests look easy, I just tried and will add conditional and mid-p conditional soon. Power in small samples (counts) will be large only for large differences in the Poisson rates. – Josef Dec 02 '15 at 00:30
  • I added exact conditional tests, but will not have time for exact or approximately exact unconditional tests, because they require some numerical experiments on how to calculate and truncate the doubly infinite Poisson sums in a robust and efficient way. – Josef Dec 02 '15 at 01:15
  • I mean that the time for the number of events is 7 days to machine precision. So absurdly accurate. – Keith Dec 02 '15 at 01:33
  • @Keith I opened a pull request with a draft version of the E-test: https://github.com/statsmodels/statsmodels/pull/2723/files#diff-8d9af3a01c72248b06370f02ea4a88b7R136 It's just a standalone function right now and is not optimized for numerically efficient calculation. It matches the numbers in Gu et al and should work well for small counts, where the asymptotic tests might have problems. – Josef Dec 05 '15 at 01:06
  • I am still pondering one thing about the underlying exact method. In short, one assumes that the test is many Bernoulli trials, one for each user for each unit of time, and then uses the binomial test. The choice of time unit does not really matter, as it scales out. However, I am a bit uncomfortable with this assumption. In my case most users do not succeed in any trials, but if they do, they succeed in about a dozen. I feel that this method loses some information. Is there a particular variant that would work best for this? Am I better off fitting a Weibull function, which is closer to the shape? – Keith Dec 06 '15 at 20:27
  • @Keith The independent, identically distributed Poisson assumption is very strong, and violations are almost never mentioned in any of the articles I looked at. If you have overdispersion overall, i.e. the variance is larger than the mean (see the dispersion-check sketch after this comment thread), then it is possible to use an exact Negative Binomial test. Standard generalized linear models are robust to variance violations, but we need to adjust the covariance of the parameter estimates. Unfortunately, so far the only robust hypothesis test in statsmodels is the Wald test. A likelihood ratio test is possible with simple over- or underdispersion. – Josef Dec 06 '15 at 22:39
  • Also, your comment sounds like you have too many zeros, and a zero-inflated Poisson might be the more appropriate distributional assumption. – Josef Dec 06 '15 at 22:41
  • I think dispersion is a good way to classify the issue. I have tried fitting a Poisson and a negative binomial, but neither was great. For now, using what you have provided gets me a reasonable enough answer to move forward with the work, but I feel I could do better. The particular case is the number of transaction events for a customer of an e-commerce company. As mentioned, ~99% never purchase. The distribution looks somewhat exponential, but the tail is clearly larger. This is what got me rethinking the problem as a Weibull analysis. I might add a new question to Cross Validated. – Keith Dec 07 '15 at 00:55
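
A minimal sketch of the dispersion check discussed in the comments above: for a Poisson sample the variance should be close to the mean, so a ratio well above 1 hints at overdispersion. The data here is purely illustrative.

import numpy as np

# hypothetical per-user event counts: mostly zeros, a few users with about a dozen events
counts = np.array([0, 0, 0, 0, 12, 0, 0, 9, 0, 0, 0, 11, 0, 0])
dispersion = counts.var(ddof=1) / counts.mean()
print('variance/mean ratio:', dispersion)  # much greater than 1 suggests overdispersion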

I wrapped Krishnamoorthy's Fortran code, based on the published paper, with numpy bindings and packaged it up. The source code is on GitHub.

Install via

pip install poisson-etest

Usage

from poisson_etest import poisson_etest

sample1_k, sample1_n = 10, 20
sample2_k, sample2_n = 15, 20
prob = poisson_etest(sample1_k, sample2_k, sample1_n, sample2_n)
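
Continuing the snippet above, and assuming (as the names suggest) that each k is an event count, each n is a number of observed unit-time intervals, and the returned prob is the p-value of the test of equal Poisson means, interpretation might look like this (the 0.05 threshold is just the usual convention):

# prob is the p-value for H0: the two Poisson means are equal
if prob < 0.05:
    print('reject H0: the two rates differ')
else:
    print('no evidence that the rates differ')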
Nolan Conaway
  • It might be worth trying to get it into a standard package like scikit-learn, but they may not like that it is a wrapper around Fortran code. – Keith Mar 25 '19 at 06:17
  • There are also a few numerical problems with the Fortran code. If this were cleaned up, it might be a good contribution to scipy.stats. – Nolan Conaway Mar 25 '19 at 12:56