I'm comparing two multilevel models in R using the anova() function. One model contains a control variable and the other an experimental variable. When I compare the two, I get a strange result: the chi-square is 0 and the p-value is 1. I would interpret this as meaning the models are not significantly different, but that doesn't square with the data or with other analyses I've run on this experimental variable. Can someone help me understand this output?
To explain the variables: block_order (the control) is the counterbalancing of the questions; it's a factor with six levels, one per question ordering. team_num is a level-2 random effect: the team the participant belongs to. cent_team_wm_agg, the experimental variable, is the team's desire to maintain a healthy weight; it is continuous. exer_vig is the continuous dependent variable, how often people exercise.
Here's the model comparison output that has me confused:
anova(m2_ev_full_team, m1_ev_control_block_team)
refitting model(s) with ML (instead of REML)
Data: clean_data_0_nona
Models:
m2_ev_full_team: exer_vig ~ 1 + cent_team_wm_agg + (1 | team_num)
m1_ev_control_block_team: exer_vig ~ 1 + block_order + (1 | team_num)
                         Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
m2_ev_full_team           4 523.75 536.27 -257.88   515.75
m1_ev_control_block_team  8 533.96 559.00 -258.98   517.96     0      4          1
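To make sure I wasn't misreading the table, I also recomputed the test statistic by hand (a rough sketch; m2_ml and m1_ml are just my names for ML refits via update(), since the anova() printout says it refits with ML):

# Refit both models with ML, mirroring what anova() reports doing
m2_ml <- update(m2_ev_full_team, REML = FALSE)
m1_ml <- update(m1_ev_control_block_team, REML = FALSE)

# Deviance of the 4-df model minus the 8-df model:
# 515.75 - 517.96 = -2.21, so the model with MORE parameters
# actually fits worse here, and the difference is negative.
# I assume that's why the table prints Chisq = 0.
chisq <- deviance(m2_ml) - deviance(m1_ml)

# pchisq() of a negative statistic returns 1, matching Pr(>Chisq)
pchisq(chisq, df = 8 - 4, lower.tail = FALSE)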
In case this helps, here are the models themselves. This is the one with the experimental variable:
summary(m2_ev_full_team <- lmer(exer_vig ~ 1 + cent_team_wm_agg + (1 |team_num), data = clean_data_0_nona))
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: exer_vig ~ 1 + cent_team_wm_agg + (1 | team_num)
Data: clean_data_0_nona
REML criterion at convergence: 519.7
Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.7585 -0.5819 -0.2432  0.5531  2.5569

Random effects:
 Groups   Name        Variance Std.Dev.
 team_num (Intercept) 0.1004   0.3168
 Residual             1.1628   1.0783
Number of obs: 169, groups:  team_num, 58

Fixed effects:
                 Estimate Std. Error       df t value Pr(>|t|)
(Intercept)       2.65955    0.09478 42.39962  28.061   <2e-16 ***
cent_team_wm_agg  0.73291    0.23572 64.27148   3.109   0.0028 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr)
cnt_tm_wm_g -0.004
And the one with the control:
summary(m1_ev_control_block_team <- lmer(exer_vig ~ 1 + block_order + (1 |team_num), data = clean_data_0_nona))
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: exer_vig ~ 1 + block_order + (1 | team_num)
Data: clean_data_0_nona
REML criterion at convergence: 525.1
Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.6796 -0.6597 -0.1625  0.5291  2.0941

Random effects:
 Groups   Name        Variance Std.Dev.
 team_num (Intercept) 0.2499   0.4999
 Residual             1.1003   1.0490
Number of obs: 169, groups:  team_num, 58

Fixed effects:
                                Estimate Std. Error       df t value Pr(>|t|)
(Intercept)                       3.0874     0.2513 155.4960  12.284   <2e-16 ***
block_orderBlock2|Block4|Block3  -0.2568     0.3057 154.8652  -0.840   0.4020
block_orderBlock3|Block2|Block4  -0.3036     0.3438 160.8279  -0.883   0.3785
block_orderBlock3|Block4|Block2  -0.6204     0.3225 161.5186  -1.924   0.0561 .
block_orderBlock4|Block2|Block3  -0.4215     0.3081 151.2908  -1.368   0.1733
block_orderBlock4|Block3|Block2  -0.7306     0.3178 156.5548  -2.299   0.0228 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) b_B2|B b_B3|B2 b_B3|B4 b_B4|B2
bl_B2|B4|B3 -0.757
bl_B3|B2|B4 -0.687  0.557
bl_B3|B4|B2 -0.733  0.585  0.543
bl_B4|B2|B3 -0.741  0.601  0.545   0.577
bl_B4|B3|B2 -0.734  0.586  0.535   0.561   0.575
EDIT: If I had to guess, I'd say it's because the control model has more degrees of freedom than the experimental one, but that's all I can think of. I've tried running anova() with the order of the models flipped, and it doesn't change anything. Even if that is the cause, I don't understand why the number of dfs would prevent the test from saying which model is better.
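For what it's worth, comparing the information criteria directly (using the m2_ml / m1_ml refits from my check above) gives the same ranking as the AIC/BIC columns in the anova() table:

# Models can at least be ranked on information criteria; these
# reproduce the AIC/BIC columns above (523.75 vs 533.96, etc.)
AIC(m2_ml, m1_ml)
BIC(m2_ml, m1_ml)

So by AIC/BIC the experimental model looks better, but I still don't understand what the chi-square test is telling me here.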
Thank you!