
I'm trying to fit some path models (i.e. all variables are observed; no latent variables) using "lavaan" in R. I've been able to do this successfully for a model where the data are completely pooled (Model 1, below). However, the data are grouped, and I'd like to fit models that account for groups as fixed effects (Model 2, below) and as random effects (i.e. a random intercept by group; Model 3, below).

I've looked at the user manual and various other online resources, but I'm having trouble working out how to code the fixed and random effects models.

I'm hoping someone might be able to provide some advice on this.

I've included simplified versions of the data and the models I'm trying to fit below. (I'm using a path model because the real data include more predictors and indirect paths.)

Dataset: the variables are 4 predictors (P1-4); 1 outcome (Outcome); 4 groups (each observation falls within one of four groups: G1-4 are dummy variables). All variables are observed (i.e. no latent variables).

Model 1: path model without accounting for groups (i.e. complete pooling)
This appears to work fine.

model1 <- "
#regression equations
P2 ~ P1
outcome ~ P1 + P2 + P3 + P4
# variance of exogenous vars
P1 ~~ P1
P3 ~~ P3
P4 ~~ P4
# covariance of exogenous vars
P3 ~~ P4
# residual var for endog
P2 ~~ P2
outcome ~~ outcome
# covar of endog vars (none)
"
fit1 <- lavaan(model1, data=mydata)

Model 2: group fixed effects
I'm not sure how to do this...
Question: Is this done by including all but one of the group dummy variables as exogenous variables, specifying a path from each dummy variable to the outcome, and including a variance term for each dummy? That is:

model2 <- "
#regression equations
P2 ~ P1
outcome ~ P1 + P2 + P3 + P4 + G2 + G3 + G4
#variance of exogenous vars
P1 ~~ P1
P3 ~~ P3
P4 ~~ P4
G2 ~~ G2
G3 ~~ G3
G4 ~~ G4
# covariance of exogenous vars
P3 ~~ P4
# residual var for endog
P2 ~~ P2
outcome ~~ outcome
# covar of endog vars (none)
"
fit2 <- lavaan(model2, data=mydata)

Model 3: random intercept for groups
I see you need to specify the level 1 (observation level) and level 2 (group level) equations. I'm not sure how to do it correctly, but my attempt is below.
Question: What is the correct way to specify a model that has random intercepts for groups? And, when fitting the model, how do I specify cluster correctly?

model3 <- "
#regression equations
level 1:
P2 ~ P1
outcome ~ P1 + P2 + P3 + P4
level 2:
outcome ~ G2 + G3 + G4
# variance of exogenous vars
P1 ~~ P1
P3 ~~ P3
P4 ~~ P4
G2 ~~ G2
G3 ~~ G3
G4 ~~ G4
# covariance of exogenous vars
P3 ~~ P4
# residual var for endog
P2 ~~ P2
outcome ~~ outcome
# covar of endog vars (none)
"
fit3 <- lavaan(model3, data=mydata, cluster="????")

Any advice would be greatly appreciated!

Cheers
Simon

1 Answer


Means and (co)variances of exogenous predictors are by default (fixed.x=TRUE) taken as given, so there is no need to estimate them (i.e., you can leave them out of your model syntax).
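As a rough sketch of that point, the fixed-effects model (Model 2) could be written with the regressions only. The simulated `mydata`, its coefficients, and the factor name `group` are assumptions made up for illustration, not the asker's data. `sem()` is used here instead of `lavaan()` because it adds the residual variances automatically, and `fixed.x = TRUE` is passed explicitly rather than relying on the default.

```r
library(lavaan)

# Hypothetical data standing in for `mydata` (4 groups, 50 observations each)
set.seed(1)
n     <- 200
group <- factor(rep(paste0("G", 1:4), each = n / 4))
P1 <- rnorm(n)
P3 <- rnorm(n)
P4 <- 0.3 * P3 + rnorm(n)
P2 <- 0.5 * P1 + rnorm(n)
shift   <- c(0, 0.5, -0.5, 1)[as.integer(group)]  # made-up group intercepts
outcome <- 0.4 * P1 + 0.3 * P2 + 0.2 * P3 + 0.1 * P4 + shift + rnorm(n)
mydata  <- data.frame(P1, P2, P3, P4, outcome, group,
                      G2 = as.numeric(group == "G2"),
                      G3 = as.numeric(group == "G3"),
                      G4 = as.numeric(group == "G4"))

# Regressions only: no variance/covariance lines for P1, P3, P4, G2-G4,
# because fixed.x = TRUE takes their sample moments as given
model2 <- "
  P2 ~ P1
  outcome ~ P1 + P2 + P3 + P4 + G2 + G3 + G4
"
fit2 <- sem(model2, data = mydata, fixed.x = TRUE)
summary(fit2)
```

With this specification, `coef(fit2)` lists the regression slopes and the two residual variances, but no variances for the exogenous predictors or dummies.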

In Model 3, leave the G dummy codes out of the model. Use the name of the original grouping variable (with 4 levels) as the cluster= argument, which will invoke random intercepts for all modeled variables. Alternatively, if you only specify a single-level model, the cluster= argument triggers cluster-robust SEs and test statistics. That might be better than random intercepts because you only have $N=4$ at Level 2, and ML-SEM gives highly biased estimates in small samples. But perhaps that is what your comparison of approaches is meant to demonstrate.
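The two options could look roughly like the sketch below. The data are simulated with 40 clusters purely so that the two-level model has something to estimate (the question has only 4 groups), and the column name `group`, the effect sizes, and the saturated level-2 part are assumptions for illustration. Note that lavaan writes the block headers as `level: 1` and `level: 2`.

```r
library(lavaan)

# Hypothetical clustered data: 40 clusters of 25 observations each
set.seed(2)
J <- 40
m <- 25
group <- rep(seq_len(J), each = m)
u_P2  <- rnorm(J, sd = 0.5)[group]  # made-up random intercepts for P2
u_out <- rnorm(J, sd = 0.7)[group]  # made-up random intercepts for outcome
P1 <- rnorm(J * m)
P3 <- rnorm(J * m)
P4 <- 0.3 * P3 + rnorm(J * m)
P2 <- 0.5 * P1 + u_P2 + rnorm(J * m)
outcome <- 0.4 * P1 + 0.3 * P2 + 0.2 * P3 + 0.1 * P4 + u_out + rnorm(J * m)
mydata <- data.frame(P1, P2, P3, P4, outcome, group)

# Option A: single-level model; cluster= requests cluster-robust SEs
model_single <- "
  P2 ~ P1
  outcome ~ P1 + P2 + P3 + P4
"
fit_robust <- sem(model_single, data = mydata, cluster = "group")

# Option B: two-level model; variables named at both levels get
# random intercepts, and no G dummies appear anywhere
model3 <- "
  level: 1
    P2 ~ P1
    outcome ~ P1 + P2 + P3 + P4
  level: 2
    P2 ~~ P2
    outcome ~~ outcome
    outcome ~~ P2
"
fit3 <- sem(model3, data = mydata, cluster = "group")
summary(fit3)
```

In both calls the cluster= argument is the name of the grouping column as a string, not the dummy variables.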

Terrence