Revised question and code (2020-07-10)
This question has undertaken several edits as the analysis has progressed. I've included the most recent information here at the top, but for those who are stuck at other steps, please see the information below.
Recap: I'm modelling the relationships between several continuous variables (A, B & C), a categorical variable (YN), and a random intercept (ID). I've included an offset (T) because the count data was collected during behavioural observations of different durations. There's also a term for zero-inflation based on the random intercept. I'm comparing all subsets of this model to determine the best-fit models.
At present, the dredge function is throwing an error when I attempt to compare all subsets.
#simulated data
df <- data.frame(
count = sample(size = 300, c(0:6), replace = TRUE, prob = c(0.7, 0.15, 0.05, 0.04, 0.03, 0.02, 0.01)),
A = sample(size = 300, x = 24:65, replace = TRUE),
B = runif(n = 300, min = 7, max = 19),
C = rbeta(n = 300, shape1 = 1, shape2 = 25)*100,
YN = sample(size = 300, c("Y", "N"), replace = TRUE, prob = c(0.2, 0.8)),
T = runif(n = 300, min = 10, max = 20),
ID = sample(size = 300, letters, replace = TRUE)
)
#standardize explanatory variables
dfSTD <- stdize(df,
omit.cols = c("count", "YN", "T", "ID"),
center = TRUE, scale = TRUE)
#specify full model with standardized covariates for analysis
full <- stdizeFit(glmmTMB(count ~ z.A + z.B + z.C + YN + offset(log(T)) + (1|ID),
zi = ~ (1|ID),
family = poisson(),
data = dfSTD,
na.action = "na.fail"),
data = dfSTD,
which = "formula",
evaluate = FALSE)
#compare all possible model subsets
all <- dredge(full,
beta = "sd",
fixed = "cond(offset(log(T)))")
Error in nobs.default(global.model) : no 'nobs' method is available
I'm working in R version 3.6.1.
I'm modelling the relationships between several continuous variables (A, B & C), a categorical variable (YN), and a random intercept (ID). I've included an offset (T) because the count data was collected during behavioural observations of different durations. There's also a term for zero-inflation based on the random intercept.
I'm comparing all subsets of this model to determine the best-fit models.
My code looks like the following:
full <- glmmTMB(count ~ A + B + C + YN + offset(log(T)) + (1|ID),
zi = ~ (1|ID),
family = poisson(),
data = df,
na.action = "na.fail")
View(dredge(full, beta = sd))
But when I look at the table, it seems like the models are duplicated, i.e., with and without the offset. The table includes 32 models, and when I leave out the offset, there are only 16 models. Plus the coefficients are very similar. (subset of this table shown)
I've tried changing my code so that the offset is always included like below, but this leads to warnings.
full <- glmmTMB(count ~ A + B + C + YN + (1|ID),
offset = log(T),
zi = ~ (1|ID),
family = poisson(),
data = df,
na.action = "na.fail")
Warning messages:
1: In (function (start, objective, gradient = NULL, hessian = NULL, :
NA/NaN function evaluation
repeated 9 times
EDIT 2020/06/16
I've updated this post with the full model summary when the offset is included as a term:
Family: poisson ( log )
Formula: count ~ A + B + C + YN + offset(log(T)) +
(1 | ID)
Zero inflation: ~(1 | ID)
Data: df
AIC BIC logLik deviance df.resid
286.4 317.4 -135.2 270.4 344
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
ID (Intercept) 0.1006 0.3172
Number of obs: 352, groups: name, 64
Zero-inflation model:
Groups Name Variance Std.Dev.
ID (Intercept) 2.089e-08 0.0001445
Number of obs: 352, groups: name, 64
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.582134 0.382926 -11.966 < 2e-16 ***
A 0.003517 0.170809 0.021 0.98357
B 0.356596 0.162703 2.192 0.02840 *
C 0.594737 0.095198 6.247 4.17e-10 ***
YN -1.397989 0.538510 -2.596 0.00943 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Zero-inflation model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.4744 0.7323 -0.648 0.517
And when the offset included as a separate parameter outside of the formula:
Family: poisson ( log )
Formula: count ~ A + B + C + YN + (1 | ID) #NOTE: offset isn't listed here despite being included when I specified the model above
Zero inflation: ~(1 | ID)
Data: df
Offset: log(duration)
AIC BIC logLik deviance df.resid
286.4 317.4 -135.2 270.4 344
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
ID (Intercept) 0.1006 0.3172
Number of obs: 352, groups: name, 64
Zero-inflation model:
Groups Name Variance Std.Dev.
ID (Intercept) 2.089e-08 0.0001445
Number of obs: 352, groups: name, 64
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.582134 0.382926 -11.966 < 2e-16 ***
A 0.003517 0.170809 0.021 0.98357
B 0.356596 0.162703 2.192 0.02840 *
C 0.594737 0.095198 6.247 4.17e-10 ***
YN -1.397989 0.538510 -2.596 0.00943 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Zero-inflation model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.4744 0.7323 -0.648 0.517
EDIT 2020/06/24
A revised dredge function addresses the issue with retaining the offset variable in all models.
View(dredge(full,
beta = "sd"), #note sd was not in quotes above!
fixed = "cond(offset(log(time)))") #retains offset in all models
However, this has led to a new issue:
Warning message:
In dredge(full, beta = "sd", fixed = "cond(offset(log(time)))") :
do not know how to standardize coefficients of 'glmmTMB', argument 'beta' ignored
@Kamil Barton suggests below that I need to use strike and stdizeFit to address this warning message. But as of yet, I am unable to get this code to work without another error message. Furthermore, I'm concerned that standardizing my response variables (as suggested below) is the wrong way to go, and since standardizing the time variable results in negative values, this can no longer be logged in the model (produces NA values).
Regardless, here's my attempt at implementing Kamil's suggested approach using a nearly identical approach as that within the help file:
dfSTD <- stdize(select(df, counts, A, B, C, YN, T, ID),
center = TRUE, scale = TRUE)
full <- stdizeFit(glmmTMB(z.counts ~ z.A + z.B + z.C + z.YN +
offset(log(z.T)) + (1|ID),
zi = ~ (1|ID),
family = poisson(),
data = dfSTD,
na.action = "na.fail"),
which = "formula", evaluate = FALSE)
Error in stdizeFit(glmmTMB(z.counts ~ z.A + z.B + z.C + z.YN + :
argument "data" is missing, with no default
Any thoughts on addressing these latest errors either Kamil's suggested approach or what I've done above would be greatly appreciated.
EDIT 2020/07/10
I have been able to address the above issue with the following chunk of code:
dfSTD <- stdize(df, omit.cols = c("counts", "YN", "T", "ID"),
center = TRUE, scale = TRUE)
full <- stdizeFit(glmmTMB(counts ~ z.A + z.B + z.C + YN +
offset(log(T)) + (1|ID),
zi = ~ (1|ID),
family = poisson(),
data = dfSTD,
na.action = "na.fail"),
data = dfSTD, which = "formula", evaluate = FALSE)
However, now when I use the dredge function, I have another error.
allEVCounts <- dredge(full,
beta = "sd",
fixed = "cond(offset(log(T)))")
Error in nobs.default(global.model) : no 'nobs' method is available
Any help in determining how and where to address the 'nobs' method would be appreciated.