
I'm running this regression.

model1 <- lm(DV ~ IV1 + IV2 + IV3 + SubjectID, data = df)

I'm checking for multicollinearity among the variables. SubjectID identifies each subject; each subject has 8 observations and there are about 300 subjects. The model above runs without errors, but when I run car::vif I get an error indicating there is multicollinearity in the model. Checking the regression output, the model reports three of the SubjectID dummies as linearly dependent. This really surprised me: my understanding is that only one of the SubjectID dummies should be linearly dependent on the rest. Regardless, assume I know that SubjectID2, SubjectID3, and SubjectID4 are the collinear ones: how do I drop them? My understanding is that if I simply subset them out, R will then report other factor levels as linearly dependent, so I can't simply drop them.
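Here is a minimal synthetic sketch of how this can happen (the variable names and data below are made up, not my actual df). If any IV is constant within a subject (between-subjects), it lies in the span of the intercept plus the SubjectID dummies, so lm() marks an extra coefficient as aliased; alias() shows exactly which term depends on which others:

```r
## Hypothetical reproduction: a between-subjects covariate is constant
## within each subject, so it is a linear combination of the intercept
## and the SubjectID dummies, and one coefficient becomes aliased.
set.seed(1)
n_subj <- 10
df <- data.frame(
  SubjectID = factor(rep(seq_len(n_subj), each = 8)),
  IV1       = rnorm(n_subj * 8),             # within-subjects: varies inside a subject
  IV2       = rep(rnorm(n_subj), each = 8)   # between-subjects: constant per subject
)
df$DV <- rnorm(nrow(df))

m <- lm(DV ~ IV1 + IV2 + SubjectID, data = df)

sum(is.na(coef(m)))   # one coefficient is dropped as linearly dependent
alias(m)$Complete     # shows which term is a combination of which others
```

Note that *which* column lm() reports as NA depends only on the order of the columns in the model matrix, which would explain why dropping the named dummies just shifts the dependency onto other levels.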

Eric Tim
  • I suspect a linear model is not suitable for your data. Perhaps ask at Cross Validated what types of mixed effects models are suitable. – missuse May 05 '21 at 21:33
  • Please post the head of your dataset, or each variable's type and levels. It seems you have a repeated-measures analysis? If so, lm() is not a good option (a mixed model is appropriate). Also, why did you put SubjectID in the formula as a separate IV? I mean, how could R know this is a SubjectID and not a separate IV? – Behnam Hedayat May 05 '21 at 22:04
  • Yes, it is a repeated-measures analysis (although one of the variables is within-subjects, so maybe not). My understanding is that putting SubjectID in as a variable is identical to using fixed effects. – Eric Tim May 05 '21 at 22:10

0 Answers