I just want to preface this by saying I'm very new to R, so I'm sorry if this has been answered elsewhere, but with my limited theoretical/technical knowledge I can't find this situation addressed anywhere else.
Context: my data come from 60 participants who were asked to rank-order their top 10 friends. Each participant was then given 100 points to allocate among those 10 friends, so the points each participant assigned across the 10 friends had to sum to 100. The prediction I'm trying to test is whether the fit of a regression model is improved by adding a quadratic term, where "points" is the dependent variable, "rank" is the independent variable, and participant "id" is entered as a random effect.
I originally ran the model like this:
m2q.2 = lmer(points ~ rank + I(rank^2) + (1 | id), data = m2.7)
summary(m2q.2)
And it returns this:
> m2q.2 = lmer(points ~ rank + I(rank^2) + (1 | id), data = m2.7)
boundary (singular) fit: see help('isSingular')
> summary(m2q.2)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: points ~ rank + I(rank^2) + (1 | id)
Data: m2.7
REML criterion at convergence: 3874.7
Scaled residuals:
Min 1Q Median 3Q Max
-2.3829 -0.5266 -0.1899 0.5085 6.6625
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.00 0.00
Residual 36.97 6.08
Number of obs: 600, groups: id, 60
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 23.83917 0.92325 597.00000 25.821 <2e-16 ***
rank -4.65554 0.38559 597.00000 -12.074 <2e-16 ***
I(rank^2) 0.30562 0.03416 597.00000 8.946 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) rank
rank -0.909
I(rank^2) 0.814 -0.975
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
The addition of the quadratic term improved the model, but lmer reports a singular fit (the optimizer itself converged fine) because the variance of the random intercept is estimated at essentially zero. After scouring help boards, most people suggest removing the random effect when this happens, since it is overfitting the model anyway; but in this case that would mean running an ordinary linear model, which would almost certainly attain a significant result due to pseudoreplication. The singular fit occurs whether or not I add the quadratic term, so it seems to be caused by the random effect itself.
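For what it's worth, this is a sketch of how the singularity can be confirmed directly (using the `m2q.2` fit from above; both functions are from lme4):

```r
library(lme4)

# Returns TRUE when a variance component is estimated at (or within
# tolerance of) its boundary of zero -- the cause of the warning above
isSingular(m2q.2)   # TRUE

# Prints the variance components; the id intercept shows Std.Dev. 0
VarCorr(m2q.2)
```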
I also tried refitting with REML set to FALSE, but it still warns of a singular fit and gives:
AIC BIC logLik deviance df.resid
3875.8 3897.8 -1932.9 3865.8 595
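For reference, the ML refits and the model comparison look roughly like this (a sketch; the object names `m2l.ml` and `m2q.ml` are just placeholders, and `m2.7` is the same data frame as above):

```r
library(lme4)

# Fit the linear and quadratic models by maximum likelihood (REML = FALSE)
# so their fixed-effect structures can be compared
m2l.ml <- lmer(points ~ rank + (1 | id), data = m2.7, REML = FALSE)
m2q.ml <- lmer(points ~ rank + I(rank^2) + (1 | id), data = m2.7, REML = FALSE)

# Likelihood-ratio test of whether the quadratic term improves the fit
anova(m2l.ml, m2q.ml)
```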
I suspect the real issue is the near-zero random-effect variance caused by the zero-sum nature of the dependent variable: because every participant's points must sum to 100, the random effect of participant "id" is overfitting the model, since the intercepts (and slopes) are so similar across participants.
So my question is this: is it possible to run a regression model that nests participant "id" without actually adding "id" as a random intercept OR slope? If so, is that even what I want to do, or is there another way to go about it? Or is the addition of the quadratic term simply not significantly improving the model?
Thanks so much!