I'm using R to perform some statistical analysis and I'm having trouble interpreting the model. Consider the following data:

df <- data.frame( richness=c(9,13,10,12,11,5,6,8,9,10,10,8, 
                 5,7,6,9,5,6,7,8,4,10,5,8, 
                 4,5,7,5,6,7,4,5,5,6,6,6, 
                 1,0,2,1,4,5,3,2,0,1,4,4),
        condition=c(rep("A",24), rep("B",24)), 
        area = c(12.62, 11.07, 15.72, 15.41, 6.42, 19.13, 17.58, 19.44, 13.55, 18.20, 6.73, 14.79, 5.80, 14.48, 17.89,  7.66, 10.76,  8.90, 8.59, 12.00, 12.93,  7.04, 17.27, 16.34,  9.83,  9.52, 19.75, 10.14, 13.86, 12.31, 16.03, 11.38, 14.17, 15.10, 18.51,  9.21, 20.06, 20.37,  7.97,  7.35,  8.28, 16.65,  6.11, 18.82, 10.45, 16.96, 11.69, 13.24),
        treatment=rep(c(rep("absent",12), rep("present",12)), 2)) 

library(MASS)
nb.fit <- glm.nb( richness ~ condition * treatment, data=df)
exp(coef(nb.fit))

Exponentiating the coefficients with exp(coef(nb.fit)) lets us recover the mean of each condition-by-treatment combination. For the sake of simplicity, let's just use the intercept: the mean richness for A:absent is 9.25 species, which can be verified with:

mean(subset(df, condition=="A" & treatment=="absent")$richness)
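The same check can be extended to all four combinations. The following is a minimal sketch, assuming R's default treatment contrasts (so the coefficients are named "(Intercept)", "conditionB", "treatmentpresent" and "conditionB:treatmentpresent"); the names cell.means and raw.means are introduced here only for illustration.

```r
library(MASS)

# Same data as above, repeated so the sketch runs on its own.
df <- data.frame(
  richness = c(9, 13, 10, 12, 11, 5, 6, 8, 9, 10, 10, 8,
               5, 7, 6, 9, 5, 6, 7, 8, 4, 10, 5, 8,
               4, 5, 7, 5, 6, 7, 4, 5, 5, 6, 6, 6,
               1, 0, 2, 1, 4, 5, 3, 2, 0, 1, 4, 4),
  condition = c(rep("A", 24), rep("B", 24)),
  treatment = rep(c(rep("absent", 12), rep("present", 12)), 2)
)

nb.fit <- glm.nb(richness ~ condition * treatment, data = df)
b <- coef(nb.fit)

# Add up the relevant coefficients on the log scale, then back-transform.
cell.means <- exp(c(
  "A:absent"  = unname(b["(Intercept)"]),
  "B:absent"  = unname(b["(Intercept)"] + b["conditionB"]),
  "A:present" = unname(b["(Intercept)"] + b["treatmentpresent"]),
  "B:present" = unname(sum(b))  # intercept + both main effects + interaction
))

# Raw group means for comparison (rows: condition, columns: treatment).
raw.means <- tapply(df$richness, list(df$condition, df$treatment), mean)

print(cell.means)
print(raw.means)
```

Without an offset, each combination has its own parameter, so the back-transformed sums should agree with the raw group means up to the fitting tolerance.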

Now, consider that the sampled areas differ in size, meaning that the sampling effort differs between observations. We should (probably must) account for this, so let's add area to the model as an offset:

nb.fit2 <- glm.nb(richness ~ condition * treatment + offset(log(area)),data=df)

Now look at the model coefficients:

exp(coef(nb.fit2))

Question 1: Is it correct to interpret the intercept as a mean richness per unit area of 0.69 species for A:absent (in this case, 0.69 species per square meter)?

Question 2: How is the offset computed? For example, I was not able to obtain the averages manually, as I did in the first case with mean(subset(df, condition=="A" & treatment=="absent")$richness), even when trying to remove the effect of the area with:

df$richness.area <- df$richness/df$area
mean(subset(df, condition=="A" & treatment=="absent")$richness.area)

The average for A:absent is not the same. I also tried to remove the effect using the residuals of the area, and the values again did not match.

Does anyone know the math involved?
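As a starting point for the arithmetic, here is a minimal sketch that swaps glm.nb for a plain Poisson glm (an assumption made only because the Poisson case has a closed form; it is not the model fitted above). With a log link, offset(log(area)) enters the linear predictor with its coefficient fixed at 1, so log(mu_i) = log(area_i) + X_i beta, i.e. mu_i = area_i * exp(X_i beta). In the Poisson case the estimated rate for a group is then the total count divided by the total area, which differs from the mean of the per-row ratios. The names pois.fit, rate.model, rate.pooled and rate.naive are hypothetical.

```r
# Same data as above, repeated so the sketch runs on its own.
df <- data.frame(
  richness = c(9, 13, 10, 12, 11, 5, 6, 8, 9, 10, 10, 8,
               5, 7, 6, 9, 5, 6, 7, 8, 4, 10, 5, 8,
               4, 5, 7, 5, 6, 7, 4, 5, 5, 6, 6, 6,
               1, 0, 2, 1, 4, 5, 3, 2, 0, 1, 4, 4),
  condition = c(rep("A", 24), rep("B", 24)),
  area = c(12.62, 11.07, 15.72, 15.41, 6.42, 19.13, 17.58, 19.44,
           13.55, 18.20, 6.73, 14.79, 5.80, 14.48, 17.89, 7.66,
           10.76, 8.90, 8.59, 12.00, 12.93, 7.04, 17.27, 16.34,
           9.83, 9.52, 19.75, 10.14, 13.86, 12.31, 16.03, 11.38,
           14.17, 15.10, 18.51, 9.21, 20.06, 20.37, 7.97, 7.35,
           8.28, 16.65, 6.11, 18.82, 10.45, 16.96, 11.69, 13.24),
  treatment = rep(c(rep("absent", 12), rep("present", 12)), 2)
)

# Poisson stand-in for the negative binomial model (assumption, see above).
pois.fit <- glm(richness ~ condition * treatment + offset(log(area)),
                family = poisson, data = df)

a.absent <- subset(df, condition == "A" & treatment == "absent")

# Rate implied by the model intercept (A:absent is the reference cell).
rate.model <- unname(exp(coef(pois.fit)["(Intercept)"]))

# Total count over total area: what the Poisson fit reproduces.
rate.pooled <- sum(a.absent$richness) / sum(a.absent$area)

# Mean of per-row ratios: the manual attempt, which weights rows equally.
rate.naive <- mean(a.absent$richness / a.absent$area)

print(c(model = rate.model, pooled = rate.pooled, naive = rate.naive))
```

For the negative binomial fit, the estimating equation weights each observation by a factor that depends on its expected count and on theta, so the exp(intercept) from glm.nb need not equal either the pooled ratio or the mean of ratios exactly.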

Thanks

Raf1987
  • I think this belongs to CrossValidated. See e.g. https://stats.stackexchange.com/questions/11182/when-to-use-an-offset-in-a-poisson-regression for a related question and answer... – coffeinjunky May 02 '17 at 15:04

0 Answers