I have a multiply-imputed model from R's mice
package in which there are lots of factor variables. For example:
library(mice)
library(Hmisc)
# turn all the variables into factors
fake = nhanes
fake$age = as.factor(nhanes$age)
fake$bmi = cut2(nhanes$bmi, g=3)
fake$chl = cut2(nhanes$chl, g=3)
head(fake)
age bmi hyp chl
1 1 <NA> NA <NA>
2 2 [20.4,25.5) 1 [187,206)
3 1 <NA> 1 [187,206)
4 3 <NA> NA <NA>
5 1 [20.4,25.5) 1 [113,187)
6 3 <NA> NA [113,187)
imput = mice(nhanes)
# big model
fit1 = glm.mids((hyp==2) ~ age + bmi + chl, data=imput, family = binomial)
I want to test the significance of each entire factor variable in the model (not the indicator variables for each level) by testing the full model against each possible nested model that drops one variable at a time. Manually, I can do:
# small model (no chl)
fit2 = glm.mids((hyp==2) ~ age + bmi, data=imput, family = binomial)
# extract p-value from pool.compare
pool.compare(fit1, fit2)$pvalue
How can I do this automatically for all the factor variables in my model? The very helpful function drop1
was suggested to me for a previous question -- now I want to do something exactly like that except for the mice
case.
Possibly helpful note: An annoying feature of pool.compare
is that it appears to want the "extra" variables in the larger model to be placed after the ones that are shared with the smaller model.