Including offset in poisson models double number of candidate models

Question

Revised question and code (2020-07-10)

This question has undertaken several edits as the analysis has progressed. I've included the most recent information here at the top, but for those who are stuck at other steps, please see the information below.

Recap: I'm modelling the relationships between several continuous variables (A, B & C), a categorical variable (YN), and a random intercept (ID). I've included an offset (T) because the count data was collected during behavioural observations of different durations. There's also a term for zero-inflation based on the random intercept. I'm comparing all subsets of this model to determine the best-fit models.

At present, the dredge function is throwing an error when I attempt to compare all subsets.

#simulated data
df <- data.frame(
  count = sample(size = 300, c(0:6), replace = TRUE, prob = c(0.7, 0.15, 0.05, 0.04, 0.03, 0.02, 0.01)),
  A = sample(size = 300, x = 24:65, replace = TRUE),
  B = runif(n = 300, min = 7, max = 19),
  C = rbeta(n = 300, shape1 = 1, shape2 = 25)*100,
  YN = sample(size = 300, c("Y", "N"), replace = TRUE, prob = c(0.2, 0.8)),
  T = runif(n = 300, min = 10, max = 20),
  ID = sample(size = 300, letters, replace = TRUE)
)
  
#standardize explanatory variables
dfSTD <- stdize(df, 
                omit.cols = c("count", "YN", "T", "ID"), 
                center = TRUE, scale = TRUE) 

#specify full model with standardized covariates for analysis
full <- stdizeFit(glmmTMB(count ~ z.A + z.B + z.C + YN + offset(log(T)) + (1|ID),
                          zi = ~ (1|ID),
                          family = poisson(),
                          data = dfSTD,
                          na.action = "na.fail"),
                  data = dfSTD, 
                  which = "formula", 
                  evaluate = FALSE)

#compare all possible model subsets
all <- dredge(full, 
              beta = "sd", 
              fixed = "cond(offset(log(T)))")

Error in nobs.default(global.model) : no 'nobs' method is available

I'm working in R version 3.6.1.

I'm modelling the relationships between several continuous variables (A, B & C), a categorical variable (YN), and a random intercept (ID). I've included an offset (T) because the count data was collected during behavioural observations of different durations. There's also a term for zero-inflation based on the random intercept.

I'm comparing all subsets of this model to determine the best-fit models.

My code looks like the following:

full <- glmmTMB(count ~ A + B + C + YN + offset(log(T)) + (1|ID), 
                zi = ~ (1|ID),
                family = poisson(),
                data = df, 
                na.action = "na.fail")
View(dredge(full, beta = sd))

But when I look at the table, it seems like the models are duplicated, i.e., with and without the offset. The table includes 32 models, and when I leave out the offset, there are only 16 models. Plus the coefficients are very similar. (subset of this table shown)

I've tried changing my code so that the offset is always included like below, but this leads to warnings.

    full <- glmmTMB(count ~ A + B + C + YN + (1|ID), 
                    offset = log(T), 
                    zi = ~ (1|ID),
                    family = poisson(),
                    data = df, 
                    na.action = "na.fail")
Warning messages:
1: In (function (start, objective, gradient = NULL, hessian = NULL,  :
  NA/NaN function evaluation
repeated 9 times

EDIT 2020/06/16

I've updated this post with the full model summary when the offset is included as a term:

 Family: poisson  ( log )
Formula:          count ~ A + B + C + YN + offset(log(T)) +  
    (1 | ID)
Zero inflation:      ~(1 | ID)
Data: df

     AIC      BIC   logLik deviance df.resid 
   286.4    317.4   -135.2    270.4      344 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 ID   (Intercept) 0.1006   0.3172  
Number of obs: 352, groups:  name, 64

Zero-inflation model:
 Groups Name        Variance  Std.Dev. 
 ID   (Intercept) 2.089e-08 0.0001445
Number of obs: 352, groups:  name, 64

Conditional model:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)    -4.582134   0.382926 -11.966  < 2e-16 ***
A              0.003517   0.170809   0.021  0.98357    
B              0.356596   0.162703   2.192  0.02840 *  
C              0.594737   0.095198   6.247 4.17e-10 ***
YN             -1.397989   0.538510  -2.596  0.00943 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Zero-inflation model:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.4744     0.7323  -0.648    0.517

And when the offset included as a separate parameter outside of the formula:

Family: poisson  ( log )
Formula:          count ~ A + B + C + YN + (1 | ID) #NOTE: offset isn't listed here despite being included when I specified the model above
Zero inflation:      ~(1 | ID)
Data: df
 Offset: log(duration)

     AIC      BIC   logLik deviance df.resid 
   286.4    317.4   -135.2    270.4      344 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 ID   (Intercept) 0.1006   0.3172  
Number of obs: 352, groups:  name, 64

Zero-inflation model:
 Groups Name        Variance  Std.Dev. 
 ID   (Intercept) 2.089e-08 0.0001445
Number of obs: 352, groups:  name, 64

Conditional model:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)    -4.582134   0.382926 -11.966  < 2e-16 ***
A              0.003517   0.170809   0.021  0.98357    
B              0.356596   0.162703   2.192  0.02840 *  
C              0.594737   0.095198   6.247 4.17e-10 ***
YN             -1.397989   0.538510  -2.596  0.00943 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Zero-inflation model:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.4744     0.7323  -0.648    0.517

EDIT 2020/06/24

A revised dredge function addresses the issue with retaining the offset variable in all models.

View(dredge(full, 
            beta = "sd"), #note sd was not in quotes above!
            fixed = "cond(offset(log(time)))") #retains offset in all models

However, this has led to a new issue:

Warning message:
In dredge(full, beta = "sd", fixed = "cond(offset(log(time)))") :
  do not know how to standardize coefficients of 'glmmTMB', argument 'beta' ignored

@Kamil Barton suggests below that I need to use strike and stdizeFit to address this warning message. But as of yet, I am unable to get this code to work without another error message. Furthermore, I'm concerned that standardizing my response variables (as suggested below) is the wrong way to go, and since standardizing the time variable results in negative values, this can no longer be logged in the model (produces NA values).

Regardless, here's my attempt at implementing Kamil's suggested approach using a nearly identical approach as that within the help file:

dfSTD <- stdize(select(df, counts, A, B, C, YN, T, ID),
                center = TRUE, scale = TRUE) 

full <- stdizeFit(glmmTMB(z.counts ~ z.A + z.B + z.C + z.YN + 
                              offset(log(z.T)) + (1|ID),
                          zi = ~ (1|ID),
                          family = poisson(),
                          data = dfSTD,
                          na.action = "na.fail"),
                   which = "formula", evaluate = FALSE)

Error in stdizeFit(glmmTMB(z.counts ~ z.A + z.B + z.C + z.YN + : 
  argument "data" is missing, with no default

Any thoughts on addressing these latest errors either Kamil's suggested approach or what I've done above would be greatly appreciated.

EDIT 2020/07/10

I have been able to address the above issue with the following chunk of code:

dfSTD <- stdize(df, omit.cols = c("counts", "YN", "T", "ID"), 
                center = TRUE, scale = TRUE) 

full <- stdizeFit(glmmTMB(counts ~ z.A + z.B + z.C + YN + 
                                  offset(log(T)) + (1|ID),
                          zi = ~ (1|ID),
                          family = poisson(),
                          data = dfSTD,
                          na.action = "na.fail"),
                  data = dfSTD, which = "formula", evaluate = FALSE)

However, now when I use the dredge function, I have another error.

allEVCounts <- dredge(full, 
                      beta = "sd", 
                      fixed = "cond(offset(log(T)))")

Error in nobs.default(global.model) : no 'nobs' method is available

Any help in determining how and where to address the 'nobs' method would be appreciated.

Maybe you could provide the standard summary output? That might be more helpful than a self compiled Excel table. — Daniel, Jun 15 '20 at 05:42
@Daniel that is what the table looks like when you use View() and dredge(). Are you looking for the summary from a single model? The table shows the similarities in the models with and without the offset, but really the offset should be included in every model. — tnt, Jun 16 '20 at 04:13
The "raw" summary might help to find out whether it's related to model fitting or to the `dredge()` function. I'm not sure what `dregde()` does with glmmTMB models here. — Daniel, Jun 16 '20 at 07:05
@Daniel I've added the summaries above. Hopefully this provide you with more clarity. My goal is to simply compare all subsets of these models, but all models must include the offset term. — tnt, Jun 17 '20 at 04:46
@BenBolker I've edited the question so that the most recent information is at the top. You should be able to run the chunk of code that I've provided (with a simulated dataset) and reproduce the error. — tnt, Jul 10 '20 at 23:24

score 1 · Answer 1 · answered Jun 19 '20 at 09:00

1

dredge treats the offset term in the same way as fixed model terms, when it is included in the formula. To keep the offset in all models, add an argument fixed = "<offset term name>", which in this case is offset(log(T)):

dredge(full, beta = "sd", fixed = "cond(offset(log(T)))")

Note that in your code beta = sd (without quotes) is treated as beta = "none" because you pass a function sd not a character string "sd" and that is not an accepted value.

The warnings about "NA/NaN function evaluation" come from glmmTMB, some sub-models may not converge with the offset. Set options(warn = 1) (immediate warning messages) and use dredge with trace = TRUE to find out which ones.

answered Jun 19 '20 at 09:00

Kamil Bartoń

1,482
9
10

Barton Thanks for that answer, and for catching my sd term. However, when I include sd in quotes, I get the following warning message: In dredge(full, beta = "sd", fixed = "cond(offset(log(T)))") : do not know how to standardize coefficients of 'glmmTMB', argument 'beta' ignored. any thoughts on fixing this? – tnt Jun 19 '20 at 16:44
@tnt that woud require implementing a function `std.coef` to standardize model coefficients. The default does not work as `glmmTMB` has three types of coefficients. – Kamil Bartoń Jun 20 '20 at 15:49
I'm very confused. Prior to running any model, I always scale my coefficients: (x-mean)/sd. Does this not work with glmmTMB? how is std.coef used? I've tried A2 <- std.coef(A), but then I get a different error (Error: $ operator is invalid for atomic vectors) – tnt Jun 22 '20 at 17:12
@tnt currently `std.coef` does not work with `glmmTMB` objects. You can, however, standardize the data (e.g. with `stdize`), followed by `stdizeFit` on the model. – Kamil Bartoń Jun 23 '20 at 15:55
Here's where I'm at now: 1) standardize coefficients: df <- df %>% mutate(Ac = stdize(A, center = TRUE, scale = TRUE), Bc = stdize(B, center = TRUE, scale = TRUE), Cc = stdize(C, center = TRUE, scale = TRUE)) – tnt Jun 23 '20 at 17:28
then 2) model: full <- stdizeFit(glmmTMB(count ~ A + B + C + YN + (1|ID), offset = log(T), zi = ~ (1|ID), family = poisson(), data = df, na.action = "na.fail"), which = "formula") – tnt Jun 23 '20 at 17:37
but I'm still getting an error message: Error in stdizeFit(glmmTMB(count ~ Ac + Bc + Cc + YN + : argument "data" is missing, with no default. this is despite specifying the data in the model. – tnt Jun 23 '20 at 17:37
Best if you look at the examples in `?stdize`. Generally this works like this: `standardized.data.frame <- stdize(your.data.frame, omit.cols = c(...)) stdizeFit(your.model, data = standardized.data.frame)` – Kamil Bartoń Jun 24 '20 at 15:24
with that approach, then I'm not just standardizing the continuous variables, but also my categorical variable, offset, and response variable. Can you provide a justification for that? the response variable seems particularly weird to me. – tnt Jun 24 '20 at 22:42
even when I do all the standardization you suggest, I am still getting the same error as above (Error in stdizeFit(glmmTMB(z.count ~ z.A + z.B + z.C + z.YN + : argument "data" is missing, with no default). For the record, the data file is specified in my model. – tnt Jun 24 '20 at 22:50
and another issue: standardizing the offset (time) results in negative numbers which produce NA's when logged and included in the model. So I really don't think EVERY term in this model should be standardized. – tnt Jun 24 '20 at 23:16
(1) The argument `omit.cols` does just that - give the names of columns you do not want to be modified. (2) `stdizeFit` expects the new (standardized) `data` to be provided. Just follow the example code from my previous comment. – Kamil Bartoń Jun 26 '20 at 14:30
I now have another error message when trying to use dredge on the full model: Error in nobs.default(global.model) : no 'nobs' method is available – tnt Jul 03 '20 at 20:50
@tnt The `nobs` method is definitely defined for `glmmTMB` objects, so check whether the `` is what you expect it to be. – Kamil Bartoń Jul 06 '20 at 06:52
not sure that I'm following. in the help file for glmmTMB, I don't see any parameter to specify for glmmTMB. furthermore when I run my model without the stdizeFit component, I don't get an error message. – tnt Jul 10 '20 at 20:16
@BenBolker see the revised question above. – tnt Jul 21 '20 at 18:42

score 0 · Answer 2 · answered Nov 03 '21 at 10:05

0

The solution, from Kamil Bartoń, is: in stdizeFit, set evaluate=TRUE , otherwise it produces a call rather then a fitted model object, which is not what you (and dredge) would expect.

answered Nov 03 '21 at 10:05

Lena

43
4

1

Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Nov 03 '21 at 10:12

Including offset in poisson models double number of candidate models

2 Answers2