Background: I am fitting a logistic mixed model with two fixed effects, and two crossed random effects. The two variables making up the fixed effects are both binary categorical variables (i.e., there are no numeric predictors).
The data are from an eye tracker, and each subject has thousands of 0s and 1s over time. One model I wanted to consider is a logistic mixed model that accounts for the correlation of the 0s and 1s within an individual, and within the image an individual is looking at at a given time (over the entire experiment, each subject looks at six different images, out of 12 possible). (As an aside, I do plan to also analyze these data with a growth curve analysis, but I was also curious to try this approach, ignoring the time course temporarily.)
Issue: I ran the model using dummy coding of both binary predictors, with the default baseline level for each. The model converged. It also made sense based on plots of the data I had already made: estimates were in the expected directions, significance of fixed effects was as expected, etc.
Then, I wanted to fit the model reparameterized with the other level of one predictor as baseline, in order to get the associated p-value (see, e.g.: https://stats.stackexchange.com/questions/449723/calculating-simple-effects-from-standard-regression-output). I used relevel() to do this, then fit the exact same model again. However, this time there was a convergence warning. I tried it again (because why not), and it failed again. I cannot come up with a reason why the baseline level should affect convergence (or lack thereof).
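For concreteness, the releveling step was just base R's relevel() on the factor; a minimal sketch, using the variable names from the dummy data below:

```r
# Default (alphabetical) coding makes "Post" the reference level of
# Intervention; relevel() switches the reference to "Pre" before refitting.
data01$Intervention <- factor(data01$Intervention)
levels(data01$Intervention)   # "Post" "Pre" (alphabetical default)

data01$Intervention <- relevel(data01$Intervention, ref = "Pre")
levels(data01$Intervention)   # "Pre" "Post"
```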
Data structure: Here is a small dummy dataset showing the general structure of the data (much reduced from the actual dataset, which has ~5000 observations (0s and 1s) per individual per image, and is over 1 million lines long in total (there are ~60 subjects in the actual dataset)). Note Time_stamp is not in the model here, as I am not considering the time course in this analysis. "AOI" in AOI_look stands for "area of interest," a common term in eye tracking data.
structure(list(Subject = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Image = c("A", "A",
"A", "B", "B", "B", "C", "C", "C", "D", "D", "D", "E", "E", "E",
"F", "F", "F", "G", "G", "G", "E", "E", "E", "D", "D", "D", "A",
"A", "A", "J", "J", "J", "L", "L", "L"), Time_stamp = c(0L, 1L,
10L, 0L, 8L, 16L, 0L, 7L, 16L, 0L, 1L, 10L, 0L, 8L, 16L, 0L,
7L, 16L, 0L, 8L, 16L, 0L, 7L, 16L, 0L, 1L, 9L, 0L, 8L, 16L, 0L,
8L, 16L, 0L, 1L, 9L), Intervention = c("Pre", "Pre", "Pre", "Pre",
"Pre", "Pre", "Pre", "Pre", "Pre", "Post", "Post", "Post", "Post",
"Post", "Post", "Post", "Post", "Post", "Pre", "Pre", "Pre",
"Pre", "Pre", "Pre", "Pre", "Pre", "Pre", "Post", "Post", "Post",
"Post", "Post", "Post", "Post", "Post", "Post"), Sex = c("Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male"), AOI_look = c(1L,
1L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L,
1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L,
1L, 1L, 1L)), class = "data.frame", row.names = c(NA, -36L))
Model 1: Intervention = Post as baseline: No convergence warning
logit_model01 <- glmer(AOI_look ~ factor(Sex)*factor(Intervention) +
(1 | Image) +
(1 | Subject),
data = data01,
family="binomial")
Output from Model 1 (converged):
summary(logit_model01)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
Family: binomial ( logit )
Formula: AOI_look ~ factor(Sex) * factor(Intervention) + (1 | Image) +
(1 | Subject)
Data: data01
AIC BIC logLik deviance df.resid
1519023.6 1519095.5 -759505.8 1519011.6 1182007
Scaled residuals:
Min 1Q Median 3Q Max
-2.6200 -0.9733 0.5504 0.8444 2.2317
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 0.2320 0.4816
Image (Intercept) 0.1318 0.3631
Number of obs: 1182013, groups: Subject, 59; Image, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.052133 0.036172 1.441 0.15
factor(Sex)Male 0.187736 0.039967 4.697 2.64e-06 ***
factor(Intervention)Pre 0.257262 0.006136 41.924 < 2e-16 ***
factor(Sex)Male:factor(Intervention)Pre -0.220240 0.007932 -27.766 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) fc(S)M fc(I)P
fctr(Sex)M -0.006
factr(Int)Pr 0.011 0.012
f(S)M:(I)P -0.013 -0.018 -0.747
Model 2: Same as Model 1, but with Intervention = Pre set as the baseline (via relevel()) before fitting: Get convergence warning:
logit_model02 <- glmer(AOI_look ~ factor(Sex)*factor(Intervention) +
(1 | Image) +
(1 | Subject),
data = data01,
family="binomial")
Convergence warning:
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00210935 (tol = 0.002, component 1)
Although there was a warning, I am still able to get the output for Model 2 with summary():
summary(logit_model02)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
Family: binomial ( logit )
Formula: AOI_look ~ factor(Sex) * factor(Intervention) + (1 | Image) +
(1 | Subject)
Data: data01
AIC BIC logLik deviance df.resid
1519023.6 1519095.5 -759505.8 1519011.6 1182007
Scaled residuals:
Min 1Q Median 3Q Max
-2.6200 -0.9733 0.5504 0.8444 2.2317
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 0.2320 0.4816
Image (Intercept) 0.1318 0.3631
Number of obs: 1182013, groups: Subject, 59; Image, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.309400 0.038106 8.120 4.68e-16 ***
factor(Sex)Male -0.032522 0.042520 -0.765 0.444
factor(Intervention)Post -0.257262 0.006158 -41.774 < 2e-16 ***
factor(Sex)Male:factor(Intervention)Post 0.220234 0.007926 27.787 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) fc(S)M fc(I)P
fctr(Sex)M -0.107
fctr(Int)Pst 0.009 0.024
f(S)M:(I)P -0.013 -0.025 -0.748
convergence code: 0
Model failed to converge with max|grad| = 0.00210935 (tol = 0.002, component 1)
Added information from the comments below Ben Bolker's answer, for clarity/completeness:
Model 3: Same as Model 1, but on aggregated data (0s and 1s summarized into successes and failures by Subject and Image combination):
The results are not the same as those from Model 1 (fit on the non-aggregated 0/1 data), even though Model 1 produced no convergence warning. The point estimates are all similar, but not all of the SEs are: the SEs for the interaction and for Intervention (within-subjects) are similar, while those for Sex (between-subjects) and the intercept are markedly different:
logit_model03 <- glmer(cbind(Successes, Failures) ~ factor(Sex)*factor(Intervention) +
(1 | Image) +
(1 | Subject),
data = data01_agg,
family="binomial")
summary(logit_model03)
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.052151 0.150564 0.346 0.729
factor(Sex)Male 0.187734 0.125563 1.495 0.135
factor(Intervention)Pre 0.257261 0.006227 41.314 <2e-16 ***
factor(Sex)Male:factor(Intervention)Pre -0.220239 0.008061 -27.322 <2e-16 ***
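(For completeness, the aggregation can be done entirely in base R; a sketch, assuming the column names from the dummy data above — my actual code may differ:)

```r
# Collapse the 0/1 outcome into (Successes, Failures) counts per
# Subject x Image cell, carrying along the categorical predictors.
data01_agg <- aggregate(cbind(Successes = AOI_look,
                              Failures  = 1 - AOI_look) ~
                          Subject + Image + Sex + Intervention,
                        data = data01, FUN = sum)
```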
To see whether the sheer size of the raw dataset (over 1.67 million lines) could be causing an issue, I took a random sample of 15,000 observations from my original dataset and refit the model on both the non-aggregated (0/1) and the aggregated (success, failure) data. (I did this for both baseline levels of Intervention, but am only showing Intervention = Post here.):
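(The subsampling itself was simple row sampling; a sketch — the seed value is arbitrary:)

```r
set.seed(123)  # arbitrary seed, for reproducibility
smaller_raw_data <- data01[sample(nrow(data01), 15000), ]
```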
Model 4: Random sample of 15000 obs. from actual dataset, non-aggregated (0/1) outcome, Intervention = Post as baseline:
logit_model04 <- glmer(AOI_look ~ factor(Sex)*factor(Intervention) +
(1 | Image) +
(1 | Subject),
data = smaller_raw_data,
family="binomial")
summary(logit_model04)
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.03212 0.15564 -0.206 0.836505
factor(Sex)Male 0.28384 0.14121 2.010 0.044429 *
factor(Intervention)Pre 0.22727 0.06449 3.524 0.000425 ***
factor(Sex)Male:factor(Intervention)Pre -0.25061 0.08410 -2.980 0.002884 **
Model 5: Random sample of 15000 obs. from actual dataset, aggregated (success, failure) outcome, Intervention = Post as baseline:
logit_model05 <- glmer(cbind(Successes, Failures) ~ factor(Sex)*factor(Intervention) +
(1 | Image) +
(1 | Subject),
data = smaller_agg_data,
family="binomial")
summary(logit_model05)
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.03211 0.15569 -0.206 0.836582
factor(Sex)Male 0.28383 0.14126 2.009 0.044500 *
factor(Intervention)Pre 0.22727 0.06450 3.524 0.000425 ***
factor(Sex)Male:factor(Intervention)Pre -0.25061 0.08412 -2.979 0.002890 **
As you can see, the point estimates and SEs for the fixed effects in Models 4 and 5 agree to at least the fourth decimal place. I also repeated this with Intervention = Pre as baseline, and the results are equivalently similar across the non-aggregated and aggregated models (as I expected would be the case).
Thus, using non-aggregated or aggregated data shouldn't matter in theory, at least for a reasonably sized (i.e., not-too-large) dataset. This leads me to wonder if there is instability in the models when the dataset gets "too large." From ?lme4::convergence, there is reason to believe this could be the case: “Exploratory analyses suggest that (1) the naive estimation of the Hessian may fail for large data sets (number of observations greater than approximately 1e5).” So maybe this is the root of what I observed in Models 1-3 above.
In my case, I have the option to aggregate the data, since both of my predictors are binary categorical; but for numeric predictors, which don't allow aggregating, this would not be an option for reducing the computational load before fitting the model.
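One thing I have not fully explored is lme4's control interface, which allows changing the optimizer or the gradient-check tolerance; a sketch using documented glmerControl()/.makeCC() arguments (I have not verified that this resolves the warning on my data):

```r
library(lme4)

# Same model as Model 2, but with the bobyqa optimizer and the
# gradient convergence check relaxed from its default tol = 2e-3.
logit_model02b <- glmer(AOI_look ~ factor(Sex)*factor(Intervention) +
                          (1 | Image) + (1 | Subject),
                        data = data01, family = "binomial",
                        control = glmerControl(
                          optimizer = "bobyqa",
                          check.conv.grad = .makeCC("warning", tol = 4e-3)))
```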