I am trying to determine the influence of five categorical and one continuous independent variable on some ecological count data using a GLM in R. Here is an example of what the data that I am using looks like:
No. DateNum Tunnel Trt Row Time AvgTemp sqTotal.L
1 44382 1 A 3 AM 30.0 1.41
2 44384 3 C 2 PM 21.0 2.23
3 44384 7 D 3 AM 24.0 3.65
4 44400 4 B 1 AM 27.5 2.78
The fixed effects DateNum, Tunnel and Row are coded as ordered factors, Trt and Time are unordered factors and AvgTemp is coded as a numeric. 'sqTotal.L' is the squart-root-transformed count data, which is a normally distributed response variable. I have decided to use GLM instead of ANOVA because the experimental design is not balanced and there are different numbers of samples from the different experimental plots.
When I run the following code for a GLM and then the drop1() function on the resulting model, the effect of my continuous fixed effect (AvgTemp) appears not be incorporated into the result of the drop1() function:
> rowfvs.glm <- glm(sqTotal.L ~ AvgTemp + Row + DateNum + Time + Tunnel + Trt,
+ family = gaussian, data = rowfvs2)
> summary(rowfvs.glm)
Call:
glm(formula = sqTotal.L ~ AvgTemp + Row + DateNum + Time + Tunnel +
Trt, family = gaussian, data = rowfvs2)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.63548 -0.38868 0.06587 0.41777 1.31886
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.89037 5.98768 1.485 0.1492
AvgTemp -0.28191 0.24566 -1.148 0.2612
Row.L -0.46085 0.24735 -1.863 0.0734 .
Row.Q 0.08047 0.25153 0.320 0.7515
DateNum.L -1.17448 0.85015 -1.382 0.1785
DateNum.Q 0.57857 0.64731 0.894 0.3793
DateNum.C -2.17331 2.15684 -1.008 0.3226
DateNum^4 -0.76025 1.09723 -0.693 0.4943
DateNum^5 -1.62269 0.68388 -2.373 0.0250 *
DateNum^6 0.63799 0.70822 0.901 0.3756
DateNum^7 NA NA NA NA
TimePM -0.31436 0.87881 -0.358 0.7233
Tunnel.L 1.38420 0.62199 2.225 0.0346 *
Tunnel.Q -0.03521 0.56561 -0.062 0.9508
Tunnel.C 0.81639 0.54880 1.488 0.1484
Tunnel^4 0.24029 0.61180 0.393 0.6976
Tunnel^5 0.30665 0.51849 0.591 0.5592
Tunnel^6 0.67603 0.53728 1.258 0.2191
TrtB 0.10067 0.40771 0.247 0.8068
TrtC 0.31278 0.41048 0.762 0.4527
TrtD -0.49857 0.46461 -1.073 0.2927
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for gaussian family taken to be 0.7583716)
Null deviance: 50.340 on 46 degrees of freedom
Residual deviance: 20.476 on 27 degrees of freedom
AIC: 136.33
Number of Fisher Scoring iterations: 2
> drop1(rowfvs.glm, test = "Chi")
Single term deletions
Model:
sqTotal.L ~ AvgTemp + Row + DateNum + Time + Tunnel + Trt
Df Deviance AIC scaled dev. Pr(>Chi)
<none> 20.476 136.33
AvgTemp 0 20.476 136.33 0.0000
Row 2 23.128 138.05 5.7249 0.05713 .
DateNum 6 25.517 134.67 10.3447 0.11087
Time 1 20.573 134.55 0.2222 0.63736
Tunnel 6 27.525 138.23 13.9039 0.03073 *
Trt 3 23.201 136.20 5.8725 0.11798
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
By contrast, when I try running the anova() function on the model, I do get an analysis of the influence of AvgTemp on sqTotal.L:
> anova(rowfvs.glm, test = "Chi")
Analysis of Deviance Table
Model: gaussian, link: identity
Response: sqTotal.L
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL 46 50.340
AvgTemp 1 0.7885 45 49.551 0.3078945
Row 2 1.0141 43 48.537 0.5124277
DateNum 6 17.6585 37 30.879 0.0007065 ***
Time 1 0.3552 36 30.523 0.4937536
Tunnel 6 7.3222 30 23.201 0.1399428
Trt 3 2.7251 27 20.476 0.3088504
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
So, my questions are why isn't the drop1() function taking AvgTemp into account, and is it sufficient to use the results from the anova() function in my report of the results, or do I need to figure out how to get the drop1() function to incorporate this continuous predictor?