According to the documentation of the mice package, if we want to impute data when we're interested in interaction terms, we need to use passive imputation, which is done the following way:
library(mice)
nhanes2.ext <- cbind(nhanes2, bmi.chl = NA)
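# Dry run (no iterations) to obtain the default method vector and predictor matrix
ini <- mice(nhanes2.ext, maxit = 0, printFlag = FALSE)
meth <- ini$method
# Passive imputation: recompute bmi.chl from bmi and chl
meth["bmi.chl"] <- "~I((bmi-25)*(chl-200))"
pred <- ini$predictorMatrix
# Do not use bmi.chl as a predictor of bmi or chl
pred[c("bmi", "chl"), "bmi.chl"] <- 0
imp <- mice(nhanes2.ext, method = meth, predictorMatrix = pred, seed = 51600, printFlag = FALSE)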
It is said that "Imputations created in this way preserve the interaction of bmi with chl."
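A minimal sketch of how this claim could be checked, assuming the same setup as above (written with full argument names): in each completed dataset, the stored bmi.chl column is compared with the product recomputed from the completed bmi and chl. The name `consistent` is introduced here for illustration only.

```r
library(mice)

# Same setup as in the question: add an empty column for the interaction.
nhanes2.ext <- cbind(nhanes2, bmi.chl = NA)
ini <- mice(nhanes2.ext, maxit = 0, printFlag = FALSE)
meth <- ini$method
meth["bmi.chl"] <- "~I((bmi-25)*(chl-200))"
pred <- ini$predictorMatrix
pred[c("bmi", "chl"), "bmi.chl"] <- 0
imp <- mice(nhanes2.ext, method = meth, predictorMatrix = pred,
            seed = 51600, printFlag = FALSE)

# For every completed dataset, compare the stored bmi.chl column with the
# product recomputed from the completed bmi and chl values.
consistent <- sapply(seq_len(imp$m), function(i) {
  d <- complete(imp, i)
  isTRUE(all.equal(as.numeric(d$bmi.chl), (d$bmi - 25) * (d$chl - 200)))
})
print(consistent)
```

If passive imputation behaves as described, every entry of `consistent` should be TRUE, meaning bmi.chl is always the deterministic function of bmi and chl given in the method string.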
Here, a new variable called bmi.chl is created in the original dataset. The meth step specifies how this variable is to be computed from the existing ones. The pred step says we don't want to predict bmi and chl from bmi.chl. But now, if we want to fit a model, how do we proceed? Is the product defined by "~I((bmi-25)*(chl-200))" just a way to control for the imputed values of the main effects, i.e. bmi and chl?
If the model we want to fit is glm(hyp~chl*bmi, family="binomial"), what is the correct way to specify this model from the imputed data: fit1 or fit2?
fit1 <- with(data=imp, glm(hyp~chl*bmi, family="binomial"))
summary(pool(fit1))
Or do we somehow have to use the imputed values of the newly created variable, i.e. bmi.chl?
fit2 <- with(data=imp, glm(hyp~chl+bmi+bmi.chl, family="binomial"))
summary(pool(fit2))
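One way to compare the two specifications, sketched here under the assumption that bmi.chl in each completed dataset equals (bmi-25)*(chl-200): that product expands to bmi*chl - 200*bmi - 25*chl + 5000, so the two design matrices may differ only in how the same columns are parametrized. The sketch below checks this numerically on the first completed dataset by comparing matrix ranks; `X1`, `X2` and the `rank*` names are illustrative.

```r
library(mice)

# Same setup as in the question, with full argument names.
nhanes2.ext <- cbind(nhanes2, bmi.chl = NA)
ini <- mice(nhanes2.ext, maxit = 0, printFlag = FALSE)
meth <- ini$method
meth["bmi.chl"] <- "~I((bmi-25)*(chl-200))"
pred <- ini$predictorMatrix
pred[c("bmi", "chl"), "bmi.chl"] <- 0
imp <- mice(nhanes2.ext, method = meth, predictorMatrix = pred,
            seed = 51600, printFlag = FALSE)

# Take one completed dataset and make sure bmi.chl is a plain numeric column.
d <- complete(imp, 1)
d$bmi.chl <- as.numeric(d$bmi.chl)

# Design matrices of the two candidate model formulas.
X1 <- model.matrix(~ chl * bmi, data = d)
X2 <- model.matrix(~ chl + bmi + bmi.chl, data = d)

# If both matrices span the same column space, stacking them side by side
# does not increase the rank.
rank1 <- qr(X1)$rank
rank2 <- qr(X2)$rank
rank_both <- qr(cbind(X1, X2))$rank
print(c(rank1 = rank1, rank2 = rank2, rank_both = rank_both))
```

If the three ranks agree, the two formulas describe the same set of fitted models on that completed dataset; the interaction coefficient would then coincide, while the intercept and main-effect coefficients would differ because of the centering at 25 and 200.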