
I want to perform multigroup SEM on imputed data using the R packages mice and semTools, specifically the runMI function, which calls lavaan.

I am able to do so when imputing the entire dataset at once, but while trawling through Stack Overflow/Stack Exchange I came across the recommendation to impute data separately for each level of a grouping variable (e.g. men, women), so that the features of each group are preserved (e.g. https://stats.stackexchange.com/questions/149053/questions-on-multiple-imputation-with-mice-for-a-multigroup-sem-analysis-inclu). However, I have not been able to find any references supporting this approach.

My question is both conceptual and practical:

1) Is splitting the dataset by group prior to imputing the correct course? Could anyone point me towards references advising this?

2) If so, how can I combine the datasets that mice imputed by group, while still retaining the multiple imputed datasets (as a mids object or a list of data frames)? I have attempted to do so, but end up with an integer:

library(lavaan)  # provides HolzingerSwineford1939
library(mice)    # mice(), complete()

set.seed(12345)
HSMiss <- HolzingerSwineford1939[ , paste("x", 1:9, sep = "")]
HSMiss$x5 <- ifelse(HSMiss$x1 <= quantile(HSMiss$x1, .3), NA, HSMiss$x5)
HSMiss$x9 <- ifelse(is.na(HSMiss$x5), NA, HSMiss$x9)
HSMiss$school <- HolzingerSwineford1939$school

HS.model <- '
visual  =~ x1 + a*x2 + b*x3
textual =~ x4 + x5 + x6
x7 ~ textual + visual + x9
'

group1 <- subset(HSMiss, school =='Pasteur')
group2 <- subset(HSMiss, school =='Grant-White')

imputed.group1 <- mice(group1, m = 3, seed = 12345) 
imputed.group2 <- mice(group2, m = 3, seed = 12345) 


# attempted merging:
imputed.both <- nrow(complete(rbind(imputed.group1, imputed.group2)))

I would be incredibly grateful if anyone can offer me some help. As you can tell, I am very much still learning about R and imputation, so apologies if this is a stupid question - however, I couldn't find anything regarding this specific query elsewhere.

L. Bakker
S Atkins

1 Answer


You are getting just an integer when merging because you are calling nrow(). Remove that call and you'll get a merged data frame.

imputed.both <- complete(rbind(imputed.group1, imputed.group2))

In case you find yourself with datasets that have multiple groups, you can do something like the following to simplify this task.

imputed.groups <- lapply(split(HSMiss, HSMiss$school), function(x) {
  complete(mice(x, m = 3, seed = 12345))
})

imputed.both <- do.call(rbind, imputed.groups)
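Note that complete() returns only the first completed dataset by default, so the code above keeps one imputation per group. If all m imputations are needed, one option is to bind the mids objects first and complete them afterwards. The following is only a minimal sketch: it assumes that the installed version of mice dispatches rbind() to its method for mids objects (as the question's own attempt does) and that both groups use the same m. The names mids.by.group and imputed.list are made up for illustration.

```r
library(lavaan)  # provides HolzingerSwineford1939
library(mice)

# Rebuild the question's example data with missing values
HSMiss <- HolzingerSwineford1939[, paste0("x", 1:9)]
HSMiss$x5 <- ifelse(HSMiss$x1 <= quantile(HSMiss$x1, .3), NA, HSMiss$x5)
HSMiss$x9 <- ifelse(is.na(HSMiss$x5), NA, HSMiss$x9)
HSMiss$school <- HolzingerSwineford1939$school

# Impute each group separately and keep the mids objects (no complete() yet)
mids.by.group <- lapply(split(HSMiss, HSMiss$school), function(x) {
  mice(x, m = 3, seed = 12345, printFlag = FALSE)
})

# Stack the mids objects row-wise (assumes mice's rbind method for mids)
imputed.both <- Reduce(rbind, mids.by.group)

# action = "all" returns a list with one completed data frame per imputation
imputed.list <- complete(imputed.both, action = "all")
```

The resulting list could then be passed as the data argument of runMI() together with group = "school", assuming the installed semTools version accepts a list of already-imputed data frames.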

As for how appropriate this approach to imputation is, that's probably a question better suited to Cross Validated.

Juan Bosco
  • It's been a while since the last update of this question. The code works nicely for my data, but I end up with a list containing the data saved for each group, but as the full data set. So what is the point of the last line of code, and which of the imputed lists should I take? Thanks – Juan Nov 11 '19 at 12:23
  • The last line binds the data frames created with lapply, one per group. It's just a convenience step that can be skipped if you don't want a single data frame with all your results. How to pick an imputed dataset is case specific. – Juan Bosco Nov 12 '19 at 17:23