I believe I have successfully used mice to do multiple imputation on a data frame. I would now like to run glm on the imputed data. My outcome variable is "MI" and my independent variables are "Hypertension" and "Diabetes". I have tried:

dat <- mice(regression)
model <- which(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial"))

But I get the following error:

Error in which(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial")):
argument to 'which' is not logical.

Does anybody know why this is?

slamballais
John M.
  • Sorry I did not mention about that... If I run your code I obtain: 'Error cannot coerce class ‘"mids"’ to a data.frame'. – John M. Nov 14 '20 at 20:08
  • `which()` is used incorrectly here. `which()` returns the index number(s) of all "TRUE" elements, like asking, "Which numbers, from 11 to 15, are less than 13?" (i.e., `which(11:15 < 13)`). You get back 1 and 2 because the first and second numbers in 11:15 are less than 13 (both TRUE, and the rest FALSE). You need to provide a logical object inside `which()`, i.e., TRUE or FALSE values. – LC-datascientist Nov 14 '20 at 23:59
  • I think you wanted to use `with()` instead of `which()`. – LC-datascientist Nov 15 '20 at 00:04

1 Answer


I think you are getting an error because you are using which() instead of with(). which() is a function that asks (in layperson's terms), "Which of these values are true?" You have to give it a logical vector, i.e., something that is TRUE or FALSE.
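As a small illustration of that behaviour (a sketch using made-up inputs, not the data from the question), which() returns the positions of the TRUE elements, and it fails with the same message as in the question when its argument is not logical:

```r
# which() returns the positions of the TRUE elements of a logical vector
idx <- which(11:15 < 13)
idx

# a non-logical argument triggers the error seen in the question;
# tryCatch() is used here only to capture the message instead of stopping
msg <- tryCatch(which("not logical"), error = function(e) conditionMessage(e))
msg
```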

with() is a function that says, "With this dataset, evaluate this expression inside it." You have to provide some kind of data environment (e.g., a list or a data frame), and you can then use the vectors inside it without naming that data environment again.
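For example, with an ordinary data frame (the data frame `patients` and its columns are hypothetical, made up only for this illustration), the columns can be referred to by name inside with():

```r
# hypothetical toy data, only for illustration
patients <- data.frame(age = c(34, 51, 47), sbp = c(118, 142, 135))

# without with(): the data frame has to be named each time
m1 <- mean(patients$sbp[patients$age > 40])

# with with(): columns are looked up inside 'patients'
m2 <- with(patients, mean(sbp[age > 40]))
```

Both calls compute the same value; with() just saves repeating the data frame's name.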

with() can be used with the mice package like this:

# example data frame
set.seed(123)
df <- data.frame(
    MI = factor(rep(c(0,1),5)), 
    Hypertension = c(rnorm(9), NA), 
    Diabetes = c(NA, rnorm(9)))

# imputation
library(mice)

dat <- mice(df)

with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial"))

This call shows you the glm() output for each imputed data set in dat. mice() creates five imputed data sets by default (m = 5).
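If you want a single set of estimates rather than five separate fits, the usual next step in mice is to combine them with pool(). The following is a minimal sketch that reuses the example data from above; the object names `fit` and `pooled` are my own, and printFlag = FALSE is only there to silence the iteration log:

```r
library(mice)

# same example data frame as above
set.seed(123)
df <- data.frame(
    MI = factor(rep(c(0, 1), 5)),
    Hypertension = c(rnorm(9), NA),
    Diabetes = c(NA, rnorm(9)))

dat <- mice(df, printFlag = FALSE)

# fit the model on every imputed data set
fit <- with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial"))

# combine the per-imputation estimates into one set of pooled estimates
pooled <- pool(fit)
pooled_summary <- summary(pooled)
pooled_summary
```

With only ten rows the model itself is not meaningful, so treat this purely as a demonstration of the workflow.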

An alternative with glm.mids()

Why doesn't glm(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) work? It gives the error "cannot coerce class ‘"mids"’ to a data.frame" because the imputed dat is not a data frame. It is an object of class "mids" (a multiply imputed data set).

Instead, mice has a function for running glm() on a "mids" object, glm.mids():

#glm(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) # it does not work

glm.mids(MI ~ Hypertension + Diabetes, family = "binomial", data = dat) # it works

with(dat, glm(MI ~ Hypertension + Diabetes, family = "binomial")) # does the same thing
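If you do need an ordinary data frame, for instance to inspect one imputed data set or to pass it to plain glm(), mice provides complete(). This is a sketch under the assumption that you only want to look at a single imputation (here the first); fitting on one completed data set ignores the uncertainty between imputations, so it is not a substitute for with():

```r
library(mice)

# same example data frame as above
set.seed(123)
df <- data.frame(
    MI = factor(rep(c(0, 1), 5)),
    Hypertension = c(rnorm(9), NA),
    Diabetes = c(NA, rnorm(9)))

dat <- mice(df, printFlag = FALSE)

# extract the first completed data set as a regular data frame
first <- complete(dat, 1)

# plain glm() accepts it, because 'first' is a data frame, not a "mids" object
single_fit <- glm(MI ~ Hypertension + Diabetes, family = "binomial", data = first)
```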

Edit Note

with() is a generic function, so when you call it on a "mids" object, R dispatches to the with.mids() method from the mice package. That method is what allows with() to work with the mice package's "mids" data class, and it supersedes glm.mids(). See here for details: https://rdrr.io/cran/mice/man/with.mids.html
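You can check this yourself. The sketch below (again using the example data from above) looks at the class of the imputed object and confirms that a with() method for "mids" is registered once mice is loaded:

```r
library(mice)

# same example data frame as above
set.seed(123)
df <- data.frame(
    MI = factor(rep(c(0, 1), 5)),
    Hypertension = c(rnorm(9), NA),
    Diabetes = c(NA, rnorm(9)))

dat <- mice(df, printFlag = FALSE)

# the imputed object is of class "mids", not "data.frame"
cls <- class(dat)
cls

# the method that with() dispatches to for "mids" objects
method <- getS3method("with", "mids")
```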

LC-datascientist