I'm using library(mice) to impute missing data. I want a way to tell mice that the ID variables should be included on the imputed data set but not used for the imputations.

For instance

#making a silly data frame with missing data
library(tidyverse)
library(magrittr)
library(mice)

d1 <- data.frame(
  id = str_c(
    letters[1:20] %>% 
      rep(each = 5),
    1:5 %>% 
      rep(times  = 20)
    ),
  v1 = runif(100),
  v2 = runif(100),
  v3 = runif(100)
  )

# set 5 randomly chosen values to NA in every column except id
d1[, -1] %<>%
  map(
    function(i){

      i[sample(100, 5)] <- NA

      i
      }
    )

This returns the mids object:

m1 <- d1 %>% 
  select(-id) %>% 
  mice

How can I include d1$id as a variable in each of the imputed data frames?

tomw

1 Answer

There are two ways. First, simply append id to the imputed datasets:

d2 <- complete(m1, 'long', include = TRUE) # imputed datasets in long format (including the original)
d3 <- cbind(id = d1$id, d2) # each stacked dataset keeps the original row order, so id is recycled across all of them
m2 <- as.mids(d3) # and transform back to a mids object

This ensures that id has no role in the imputation process, but it relies on the row order being unchanged, which makes it a bit sloppy and prone to error. Another way is to simply remove id from the predictor matrix.

The 2011 manual by Van Buuren & Groothuis-Oudshoorn says: "The user can specify a custom predictorMatrix, thereby effectively regulating the number of predictors per variable. For example, suppose that bmi is considered irrelevant as a predictor. Setting all entries within the bmi column to zero effectively removes it from the predictor set ... will not use bmi as a predictor, but still impute it."

To do this:

ini <- mice(d1, maxit = 0) # dry run without iterations to get the predictor matrix

pred1 <- ini$predictorMatrix # this is your predictor matrix
pred1[, 'id'] <- 0 # set all id column values to zero to exclude it as a predictor

m1 <- mice(d1, predictorMatrix = pred1) # use the new matrix in mice
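
A quick way to check that the predictor-matrix approach does what is intended is sketched below. It is self-contained, so it rebuilds a small toy data frame in base R; the names toy, pred, imp and first, and the seed, are assumptions made for illustration and are not part of the question's code.

```r
library(mice)

set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Toy data in the spirit of the question: an id column plus three numeric columns
toy <- data.frame(
  id = paste0(rep(letters[1:20], each = 5), rep(1:5, times = 20)),
  v1 = runif(100),
  v2 = runif(100),
  v3 = runif(100),
  stringsAsFactors = FALSE
)
for (v in c("v1", "v2", "v3")) toy[sample(100, 5), v] <- NA

# Dry run to obtain the default predictor matrix, then zero the id column
ini  <- mice(toy, maxit = 0)
pred <- ini$predictorMatrix
pred[, "id"] <- 0

imp <- mice(toy, predictorMatrix = pred, printFlag = FALSE)

# Inspect one completed dataset: id should be carried along unchanged
first <- mice::complete(imp, 1)
head(first)
imp$predictorMatrix[, "id"]  # the id column of the matrix actually used
```

If the id column of imp$predictorMatrix is all zeros and first$id matches toy$id, then id was kept in the data but played no part in predicting v1 to v3.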

You can also prevent mice from imputing the variable, but as it contains no missing values this is not necessary (mice will skip it automatically).
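
For the case where an ID-like column does contain missing values, a minimal sketch of switching off its imputation is to set that variable's entry in the method vector to an empty string. The data frame toy and the objects meth, pred and imp below are hypothetical names used only for this example.

```r
library(mice)

set.seed(2)  # assumption: seed added only so the sketch is reproducible

# Small toy data: an id column, one incomplete and one complete numeric column
toy <- data.frame(
  id = paste0("s", 1:100),
  v1 = runif(100),
  v2 = runif(100),
  stringsAsFactors = FALSE
)
toy$v1[sample(100, 5)] <- NA

ini <- mice(toy, maxit = 0)   # dry run to get the default settings

meth <- ini$method
meth["id"] <- ""              # empty method: mice does not impute this variable

pred <- ini$predictorMatrix
pred[, "id"] <- 0             # and do not use it as a predictor either

imp <- mice(toy, method = meth, predictorMatrix = pred, printFlag = FALSE)
imp$method                    # shows which method is applied to each variable
```

Here id has no missing values, so the empty method changes nothing in practice; the same two settings would matter for an ID column with gaps.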

Niek