I have discrete count data (trap_catch) for two groups defined by the variable in_tree (1 = trap in a tree, 0 = trap not in a tree), and I want to test whether counts differ between the two groups. The data are overdispersed and contain many zeroes, so I have concluded that I need a hurdle model. Is this a reasonable choice?
trap_id  trap_catch  in_tree
1        0           0
2        10          1
3        0           0
4        0           1
5        9           1
6        3           0
The table above shows how the data are set up. My code is as follows:
library(pscl)  # assumption: hurdle() here is the one from the pscl package
mod.hurdle <- hurdle(trap_catch ~ in_tree, data = data, dist = "negbin")
summary(mod.hurdle)
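Before reading the model output, it may help to tabulate zeros and non-zeros by group and to compare the mean and variance of the counts in each group. The sketch below uses only the six example rows shown above; `example_data`, `zero_table` and `group_summary` are illustrative names that do not appear in the original code, and the same calls would need to be run on the full data set to say anything about it.

```r
# The six example rows from the question, rebuilt as a data frame
# (illustrative only; the real analysis uses the full data set).
example_data <- data.frame(
  trap_id    = 1:6,
  trap_catch = c(0, 10, 0, 0, 9, 3),
  in_tree    = c(0, 1, 0, 1, 1, 0)
)

# Cross-tabulate group membership against whether the trap caught anything.
# A cell with a zero (or near-zero) count here is the kind of pattern that
# can produce a huge coefficient and standard error in the zero hurdle part.
zero_table <- table(
  in_tree = example_data$in_tree,
  nonzero = example_data$trap_catch > 0
)
print(zero_table)

# Mean and variance of the counts per group; a variance much larger than
# the mean is consistent with overdispersion.
group_summary <- aggregate(
  trap_catch ~ in_tree,
  data = example_data,
  FUN  = function(x) c(mean = mean(x), var = var(x))
)
print(group_summary)
```

With only six rows this says nothing reliable about the real data; it just shows the shape of the check.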
The output of summary(mod.hurdle) is below, and it looks very different from any examples I have read:
Pearson residuals:
    Min      1Q  Median      3Q     Max
-0.8986 -0.6635 -0.2080  0.2474  6.8513

Count model coefficients (truncated negbin with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.2582     0.1285   9.793  < 2e-16 ***
in_tree       1.3722     0.3100   4.426 9.58e-06 ***
Log(theta)   -0.2056     0.2674  -0.769    0.442

Zero hurdle model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.5647     0.1944   8.049 8.32e-16 ***
in_tree      16.0014  1684.1379   0.010    0.992
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Theta: count = 0.8142
Number of iterations in BFGS optimization: 8
Log-likelihood: -513.7 on 5 Df
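
As a starting point for reading these numbers, the coefficients can be back-transformed from their link scales. This is only a sketch, and it assumes the usual conventions for this kind of output: the count part is on the log scale and describes the mean of the underlying (untruncated) negative binomial, and the zero hurdle part is on the logit scale and describes the probability of a non-zero count.

$$
\begin{aligned}
\hat{\theta} &= \exp(-0.2056) \approx 0.814 \\
\exp(1.2582) &\approx 3.52 && \text{(untruncated NB mean, in\_tree = 0)} \\
\exp(1.3722) &\approx 3.94 && \text{(multiplicative change in that mean for in\_tree = 1)} \\
\exp(1.2582 + 1.3722) &\approx 13.9 && \text{(untruncated NB mean, in\_tree = 1)} \\
\frac{1}{1 + e^{-1.5647}} &\approx 0.83 && \text{(probability of a non-zero count, in\_tree = 0)} \\
\frac{1}{1 + e^{-(1.5647 + 16.0014)}} &\approx 1.00 && \text{(probability of a non-zero count, in\_tree = 1)}
\end{aligned}
$$

The first line reproduces the reported Theta of 0.8142. A logit-scale estimate of about 16 with a standard error of about 1684 is the pattern that typically appears when one group has (almost) no zeros, so the fitted probability is pushed to 1; whether that is what happened here is an assumption that would need to be checked against the data.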
I am confused as to how to interpret these results.
I apologise in advance for my lack of understanding - I am very new to this type of analysis.