I am fitting a Poisson regression to count data. The Poisson model assumes that the variance equals the mean, so I am checking the dispersion to make sure this holds. I am using two methods:
- dispersiontest() from the AER package.
- fitting a negative binomial model with glm.nb() (MASS package) and looking at the dispersion it reports.
> pm <- glm(myCounts ~ campaign, d, family = poisson)
> summary(pm)
Call:
glm(formula = myCounts ~ campaign, family = poisson, data = d)
Deviance Residuals:
Min 1Q Median 3Q Max
-4.074 -1.599 -0.251 1.636 6.399
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.955870 0.032174 154.03 <2e-16 ***
campaign -0.025879 0.001716 -15.08 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 428.04 on 35 degrees of freedom
Residual deviance: 195.81 on 34 degrees of freedom
AIC: 426.37
Number of Fisher Scoring iterations: 4
> dispersiontest(pm)
Overdispersion test
data: pm
z = 3.1933, p-value = 0.0007032
alternative hypothesis: true dispersion is greater than 1
sample estimates:
dispersion
5.53987
> # Calculate dispersion with Negative Binomial
> nb_reg <- glm.nb(myCounts ~ campaign, data=d)
> summary.glm(nb_reg)
Call:
glm.nb(formula = myCounts ~ campaign, data = d, init.theta = 22.0750109,
link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.9235 -0.7083 -0.1776 0.6707 2.4495
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.914728 0.082327 59.697 < 2e-16 ***
campaign -0.023471 0.003965 -5.919 1.1e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(22.075) family taken to be 1.069362)
Null deviance: 76.887 on 35 degrees of freedom
Residual deviance: 35.534 on 34 degrees of freedom
AIC: 325.76
Number of Fisher Scoring iterations: 1
As you can see, the NB fit reports a dispersion of 1.069362, whereas dispersiontest() estimates a dispersion of about 5.5 and clearly indicates overdispersion (p-value = 0.0007032). If I am not mistaken, the AER test is not a parametric test, so it can only tell us whether there is over- or underdispersion. Even so, the two methods seem to contradict each other.
Does anybody know why?
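In case it helps to reproduce the comparison without my data, here is a minimal sketch on simulated counts. The data frame `sim`, the helper `pearson_disp()` and the simulation parameters are made up for illustration and are not from my analysis. It computes the Pearson dispersion (sum of squared Pearson residuals divided by the residual degrees of freedom) for both fits. As far as I understand, this is the quantity that summary.glm() reports as the "dispersion parameter" when the family is neither poisson nor binomial, so for the NB fit it would be measured against the NB variance mu + mu^2/theta rather than against the Poisson variance mu.

```r
library(MASS)  # glm.nb()

set.seed(1)

# Simulated stand-in data (hypothetical; NOT the data frame d from above)
sim <- data.frame(campaign = 1:36)
sim$myCounts <- rnbinom(nrow(sim),
                        mu   = exp(4.9 - 0.02 * sim$campaign),
                        size = 20)  # size plays the role of theta

pm_sim <- glm(myCounts ~ campaign, data = sim, family = poisson)
nb_sim <- glm.nb(myCounts ~ campaign, data = sim)

# Pearson dispersion: sum of squared Pearson residuals / residual df.
# Pearson residuals are scaled by the variance function of the fitted family.
pearson_disp <- function(fit) {
  sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
}

disp_pois <- pearson_disp(pm_sim)  # relative to Var = mu
disp_nb   <- pearson_disp(nb_sim)  # relative to Var = mu + mu^2 / theta

# Dispersion that summary.glm() estimates for the NB fit; it should agree
# with disp_nb up to numerical tolerance.
disp_nb_summary <- summary.glm(nb_sim)$dispersion

print(c(poisson = disp_pois, negbin = disp_nb, negbin_summary = disp_nb_summary))
```

The exact values depend on the seed, so I am not quoting them here. The point is only that the two dispersion figures are computed against different variance functions.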