
I would like to study differences in fat between 2 visits with a linear mixed effects model, so the model would start as `lme(fat ~ ...)`. As for the covariates, some can change from visit 1 to visit 2 (hypertension status, diabetes status, BMI, waist circumference, smoking status, etc.), while others do not change between visits (gender, ethnicity).

Note that the following variables are dummy variables (hypertension status, diabetes status, smoking status, gender), while the following are continuous (BMI, waist circumference, age).

My initial model using nlme package was expressed as:

lme(fat~ diabetes_status + hypertension_status + bmi + waist + smoker + gender + ethnicity, random= ~1|PatientID/Visit, data = df_1, na.action = na.omit)

`visit` has 2 levels (1 and 2).

However, I have been told that the variables which change over time should be random effects, while all the others should be fixed. In another Stack Overflow question (specifying multiple separate random effects in nlme) I read that nlme is not good at specifying crossed effects (i.e., multiple separate random effects) and that the lme4 package handles this better.

I tried multiple ways of doing this:

attempt_1 = lmer(fat ~ gender + ethnicity + (1|diabetes_status) + (1|hypertension_status) + (1|PatientID/visit), data=df_1, REML=TRUE)

attempt_2 = lmer(fat ~ gender + ethnicity + (1|diabetes_status) + (1|hypertension_status) + (1|PatientID/visit), data=df_1, REML=FALSE)

attempt_3 = lmer(fat ~ gender + ethnicity + (diabetes_status+hypertension_status|PatientID/visit), data=df_1, REML=TRUE)

attempt_4 = lmer(fat ~ age + ethnicity + (1|diabetes_status) + (1|hypertension_status) + (1|PatientID/visit), data=df_1, REML=FALSE)

attempt_5 = lmer(fat ~ age + ethnicity + (1+diabetes_status+hypertension_status|PatientID/visit), data=df_1, REML=TRUE)

But none of these attempts works, and the error is always the same: `Error: number of levels of each grouping factor must be < number of observations`. I assume this could be for one of these 3 reasons:

  1. The code is not correct in any of the attempts. If so, what would be the best way to express this?

  2. The random effects should really be fixed effects. In that case the right model would be `lme(fat~ diabetes_status + hypertension_status + bmi + waist + smoker + gender + ethnicity, random= ~1|PatientID/Visit, data = df_1, na.action = na.omit)`, which runs perfectly.

  3. Linear mixed effects models are not designed to handle so many random effects.

Any thoughts? Thanks!
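
For reference, a minimal sketch of what the error message checks, assuming (as in this data) exactly one row per patient per visit. The data frame `toy` and its size are made up for illustration; only base R is used, so no model is actually fitted here.

```r
# Toy data (hypothetical): 4 patients, 2 visits each, one row per patient-visit
toy <- data.frame(
  PatientID = factor(rep(1:4, each = 2)),
  visit     = factor(rep(1:2, times = 4))
)

# In lme4, (1 | PatientID/visit) expands to
# (1 | PatientID) + (1 | PatientID:visit)
n_obs           <- nrow(toy)
n_patient       <- nlevels(toy$PatientID)
n_patient_visit <- nlevels(interaction(toy$PatientID, toy$visit, drop = TRUE))

# PatientID has fewer levels than rows, so it passes the check
print(n_patient < n_obs)

# PatientID:visit has one level per row, so it fails the check:
# its random intercept cannot be separated from the residual error
print(n_patient_visit < n_obs)
```

With one row per patient-visit combination, the nested `visit` term is the part that triggers the error, regardless of how many rows the data set has.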

Lili
  • I'd go for the 3rd option. It looks like your data is too small. If `nrow(df_1)` is smaller than the number of your dummies [= the number of variables in your model], you can't fit a linear model. – Edo Aug 07 '20 at 10:28
  • Are you sure? `nrow(df_1)` = 5550. Thanks for your answer! – Lili Aug 07 '20 at 10:30
  • Do you have one line per patient? I ask because `PatientID` is in the formula. – Edo Aug 07 '20 at 10:32
  • Each patient appears in 2 lines (1 line per visit) – Lili Aug 07 '20 at 10:32
  • The names of your `_status` variables indicate to me that they are not something like an ID. They also probably have very few possible values. That indicates they should be fixed effects. It appears to me that you should go with the model in "reason 2" and maybe try adding interactions to the fixed effects and/or slopes to the random effects. I also don't understand why you nest `Visit` within `PatientID`. If you have only two visits per patient, I think it should either be a fixed effect or be omitted from the model. – Roland Aug 07 '20 at 10:34
  • @Roland, the `_status` variables are TRUE/FALSE variables. Could you give me a couple of examples of how you would code the interactions in the fixed effects and the slopes in the random effects? I also think visit shouldn't be nested in PatientID... I wouldn't use visit at all. But how can I code this to assess whether the change in fat is determined by any of the covariates? Thanks so much!! – Lili Aug 07 '20 at 10:40
  • `help("formula")` explains how to code interactions. Random slopes are coded just like fixed slopes and come before the `|` in random effects. – Roland Aug 07 '20 at 10:43
  • Cool! One last question: would this give me an assessment of whether the change in fat between visit 1 and 2 is determined by any of the covariates? – Lili Aug 07 '20 at 10:44
  • And a reason to use `visit` is that it could have a systematic effect if a first visit already helped alleviate some symptoms or achieved a change in behavior. But it would be a fixed effect. – Roland Aug 07 '20 at 10:45
  • IIUC this strikes me as the kind of model where you could difference the outcome (value at visit 1 minus value at visit 2) and then just use ordinary linear regression. Although FHarrell [suggests](https://stats.stackexchange.com/questions/15713/is-it-valid-to-include-a-baseline-measure-as-control-variable-when-testing-the-e#15795) why bother with computing the change. – user20650 Aug 07 '20 at 13:03
  • Well spotted. I also tried what F Harrell suggests, but then you would use only the baseline measures of your covariates, rather than both baseline and follow-up measures. And maybe the fact that someone had hypertension and no longer has it makes a difference to the change in fat. Does this make sense? But thanks for your comment, it's always helpful to bring new views :) – Lili Aug 07 '20 at 13:21
  • Ah okay, I missed that your covariates were changing. – user20650 Aug 07 '20 at 13:29
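
A minimal sketch of the direction the comments point to: `visit` and a time-varying status as fixed effects, with their interaction, plus a random intercept per patient. It is not a definitive model for this data. The data frame `toy`, the number of patients, and all effect sizes are invented for illustration, and only one covariate is included to keep it short.

```r
library(nlme)  # recommended package shipped with standard R installations

set.seed(1)
n <- 50  # hypothetical number of patients

# Simulated data: two rows per patient, status may differ between visits
toy <- data.frame(
  PatientID       = factor(rep(seq_len(n), each = 2)),
  visit           = factor(rep(1:2, times = n)),
  diabetes_status = rbinom(2 * n, 1, 0.3) == 1
)

# Invented outcome: patient-level intercept + status and visit effects + noise
patient_effect <- rnorm(n)[as.integer(toy$PatientID)]
toy$fat <- 25 + 2 * toy$diabetes_status - 1 * (toy$visit == "2") +
  patient_effect + rnorm(2 * n)

# visit * diabetes_status = both main effects plus their interaction;
# random intercept per patient only (visit is not nested as a random effect)
fit <- lme(fat ~ visit * diabetes_status,
           random = ~ 1 | PatientID,
           data = toy, na.action = na.omit)

# Names of the estimated fixed-effect coefficients
print(names(fixef(fit)))
```

A random slope would go before the `|`, for example `random = ~ bmi | PatientID`, although with only two rows per patient such a term may be hard to estimate.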
