
I used mice to generate five imputed data sets, saved as the object "allImputations" in the code below. I then needed to run linear and logistic (dichotomous-outcome) regression analyses across the imputed data sets. A successful linear example is shown below:

SIStep2a<-with(allImputations, lm(Y1~X1+X2+X3)) 
SIStep2a<-as.mira(SIStep2a)
summary(pool(SIStep2a))
pool.r.squared(SIStep2a, adjusted = FALSE)

The code above provides all the information I need for linear models, but I run into problems when I use glm to fit a logistic regression on the same imputed data sets.

treat.Step1a<-with(allImputations, glm(Y2~X1+X2+X3, family=binomial))
treat.Step1a<-as.mira(treat.Step1a)
summary(pool(treat.Step1a))

In this instance, I need a pooled pseudo R2 or another pooled model fit index (similar to what pool.r.squared provides for lm). However, I cannot find a way to produce either pooled model fit indices or the fit indices for each of the five per-imputation analyses.

Essentially, is there a pool.r.squared analog for glm analyses across multiply imputed datasets from mice? Or is there a longhand way to calculate this via the info in the saved object "treat.Step1a" above? Or is there a way to isolate fit indices for each of the five analyses completed for each imputed data set?

Update

I was able to install a package directly from GitHub (glmice), which is no longer available via CRAN. However, its mcf() command would not execute successfully in my current RStudio setup.

I ultimately ran each step of the model (i.e. as I added each block of variables) on all five imputed data sets and averaged McFadden's R2 across the five fits to very roughly assess the improvement in pseudo R2.
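For concreteness, the averaging step can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a validated pooling rule: mcfadden_r2 and make_data are hypothetical helper names, the five simulated data frames merely stand in for the completed data sets, and the deviance shortcut assumes an ungrouped 0/1 outcome (where the deviance equals -2 times the log-likelihood). With mice, the per-imputation fits should be available as the list treat.Step1a$analyses.

```r
# McFadden's pseudo R-squared for one fitted binomial glm.
# Assumption: ungrouped 0/1 outcome, so deviance = -2 * logLik and
# 1 - deviance / null.deviance equals 1 - logLik(model) / logLik(null).
mcfadden_r2 <- function(fit) {
  1 - fit$deviance / fit$null.deviance
}

# Hypothetical stand-in for five completed (imputed) data sets.
set.seed(1)
make_data <- function(n = 200) {
  X1 <- rnorm(n)
  X2 <- rnorm(n)
  X3 <- rnorm(n)
  Y2 <- rbinom(n, 1, plogis(0.5 * X1 - 0.8 * X2))
  data.frame(Y2, X1, X2, X3)
}

# One logistic fit per data set.
# With mice this list would instead be: fits <- treat.Step1a$analyses
fits <- lapply(1:5, function(i) {
  glm(Y2 ~ X1 + X2 + X3, family = binomial, data = make_data())
})

# Pseudo R-squared for each fit, then the simple average across fits.
r2_each <- vapply(fits, mcfadden_r2, numeric(1))
r2_mean <- mean(r2_each)

print(r2_each)
print(r2_mean)
```

Running the same steps for each block of predictors and comparing the averaged values gives the rough step-by-step comparison described above.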

Is this an acceptable middle ground approach?

Clar_k