
I have a question about handling the following missing-data scenario with a linear mixed-effects model.

Suppose I have a closed longitudinal cohort followed for six years, with 1500 individuals at the initial wave. The number of available observations at each wave is as follows:

Wave 1: 1500
Wave 2: 1400
Wave 3: 1000
Wave 4: 800
Wave 5: 500
Wave 6: 67
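To make the scale of the missingness concrete, the counts above can be turned into the percentage of the initial 1500 participants who are unobserved at each wave. This is only a small arithmetic sketch; the object names (`n_obs`, `pct_missing`, `attrition`) are made up for illustration.

```r
# Observed sample size at each wave, taken from the counts listed above
n_obs <- c(1500, 1400, 1000, 800, 500, 67)

# Percentage of the initial cohort with no observation at each wave
pct_missing <- round(100 * (1 - n_obs / n_obs[1]), 1)

attrition <- data.frame(wave = 1:6, n_obs = n_obs, pct_missing = pct_missing)
print(attrition)
```

By this calculation about two thirds of the cohort are unobserved at wave 5 and about 95.5% at wave 6.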

There are two reasons for the missing observations. First, some people dropped out. Second, data collection is ongoing, and not all individuals have been interviewed yet (this is more likely in the later waves).

I know that a linear mixed-effects model fitted by maximum likelihood can handle missing data if they are MAR or MCAR. My question is: if I assume all missingness occurs at random, should I drop the observations from wave 6 to avoid biased estimates? In other words, under that assumption, should I drop a specific wave with a substantial amount of missingness to avoid bias?

The model I would like to run is the following:

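library(lme4)

# PRS x age x APOE interactions (linear and quadratic age), plus covariates;
# random intercepts for family and for DBID nested within family
m_Kunkle_exe <- lmer(cs_exec_fn ~ PRS_Kunkle*AgeAtVisit*APOE_score +
                   PRS_Kunkle*I(AgeAtVisit^2)*APOE_score +
                   gender + EdYears_Coded_Max20 + VisNo + famhist + X1 + X2 + X3 + X4 + X5 +
                   (1 | family/DBID),
                 data = WRAP_all, REML = FALSE)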
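To illustrate the comparison I am asking about, here is a minimal sketch that fits the same kind of model with and without wave 6 and puts the fixed-effect estimates side by side. It uses simulated data, not the WRAP data: the variables `id`, `wave`, `y`, and the retention probabilities in `keep_prob` are all hypothetical, the model is heavily simplified, and rows are deleted completely at random by construction, so it says nothing about whether MAR is plausible in my real data.

```r
library(lme4)

set.seed(1)

# Hypothetical simulated cohort: 300 individuals, 6 waves, random intercepts
n_id <- 300
d <- expand.grid(id = seq_len(n_id), wave = 1:6)
u <- rnorm(n_id, sd = 1)                       # individual-level random intercepts
d$y <- 2 + 0.3 * d$wave + u[d$id] + rnorm(nrow(d), sd = 1)

# Keep fewer rows in later waves, completely at random (assumed probabilities
# loosely mimicking the observed counts; this is MCAR by construction)
keep_prob <- c(1, 0.93, 0.67, 0.53, 0.33, 0.045)
d <- d[runif(nrow(d)) < keep_prob[d$wave], ]

# Same model fitted to all waves and to waves 1-5 only
fit_all   <- lmer(y ~ wave + (1 | id), data = d, REML = FALSE)
fit_drop6 <- lmer(y ~ wave + (1 | id), data = subset(d, wave < 6), REML = FALSE)

# Side-by-side fixed-effect estimates
comparison <- cbind(all_waves = fixef(fit_all), without_wave6 = fixef(fit_drop6))
print(comparison)
```

With my real data I would replace the simulated data frame with `WRAP_all` and the formula with the full model above, then compare the two sets of estimates and their standard errors.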

Many thanks

zjppdozen
  • ? I don't see a "wave 7" in your data description; did you mean "wave 6"? Can you give an example of a formula (possibly simplified) that might describe your model? (It seems dangerous to assume that dropout is random, unless you know a lot about the dropout process ...) – Ben Bolker Oct 28 '21 at 19:52
  • @BenBolker, Hi Ben, I am sorry, I meant wave 6. I have fixed it in the question and also added my model there. – zjppdozen Oct 28 '21 at 20:15
