
Is it possible to get an imputation from the mice package even when all observed values in a column are the same? In that case it should simply impute that value.

Example:

library(mice)
test <- data.frame(var1 = c(2.3, 2.3, 2.3, 2.3, 2.3, NA), var2 = c(5.3, 5.6, 5.9, 6.4, 4.5, NA))
miceImp <- mice(test)
testImp <- complete(miceImp)

This only imputes var2. I would like it to replace the NA in var1 with 2.3 as well.

EngrStudent
PrincessJellyfish

1 Answer


You can use passive imputation for this. For a full explanation, see section 3.4 on page 25 of this article. For a constant variable x, the goal is to set its imputation method so that every missing value is filled with the constant itself. If the constant value of x is y, then the imputation method for x should be the string "~I(y)", with y written out as the actual number (for example "~I(2.3)").

library(mice)

test <- data.frame(
  var1 = c(2.3, 2.3, 2.3, 2.3, 2.3, NA, 2.3),
  var2 = c(5.3, 5.6, 5.9, 6.4, 4.5, 5.1, NA),
  var3 = c(NA, 1:6))
cVars <- which(sapply(test, sd, na.rm = TRUE) == 0) # determine which vars are constant (props to SimonG)
allMeans <- colMeans(test, na.rm = TRUE) # get the column means
miceImp.ini <- mice(test, maxit = 0, printFlag = FALSE) # initial mids object with no imputations
meth <- miceImp.ini$method # extract the imputation method vector
meth[cVars] <- paste0("~I(", allMeans[cVars], ")") # set the imputation method to a constant (the current column mean)
miceImp <- mice(test, method = meth) # run the imputation with the user-defined imputation methods
testImp <- complete(miceImp) # extract a completed dataset
View(testImp) # take a look at it
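The constant check above compares a standard deviation to exactly 0, which only works for numeric columns. A minimal sketch of an alternative, using base R only, counts the distinct observed values instead. The helper name `is_constant` and the vector `constMeth` are hypothetical names introduced for illustration; they are not part of mice. The sketch assumes the constant columns are numeric, since the value is pasted into the "~I(...)" string as-is.

```r
# Hypothetical helper: TRUE when a column has exactly one distinct observed value.
is_constant <- function(x) length(unique(x[!is.na(x)])) == 1L

test <- data.frame(
  var1 = c(2.3, 2.3, 2.3, 2.3, 2.3, NA, 2.3),
  var2 = c(5.3, 5.6, 5.9, 6.4, 4.5, 5.1, NA),
  var3 = c(NA, 1:6)
)

# Names of the constant columns.
cVars <- names(test)[vapply(test, is_constant, logical(1))]

# One passive-imputation string per constant column (assumes numeric columns).
constMeth <- vapply(
  cVars,
  function(v) paste0("~I(", unique(test[[v]][!is.na(test[[v]])]), ")"),
  character(1)
)
print(constMeth)
```

With the method vector extracted as in the answer, `meth[cVars] <- constMeth` would then take the place of the assignment based on column means.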

All that being said, a constant variable carries no information for the imputation model and is rarely useful in later analysis, so it may be more efficient to drop any constant variables before imputation (since imputation is such a costly process).
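Dropping the constant columns could be sketched roughly as follows. The helpers `split_constant` and `restore_constant` are hypothetical and written in base R; they are not mice functions. The imputation step itself is left as a commented placeholder so the sketch runs without mice installed.

```r
# Hypothetical helper: separate constant columns from the rest.
split_constant <- function(df) {
  is_const <- vapply(df, function(x) length(unique(x[!is.na(x)])) == 1L, logical(1))
  list(
    varying = df[, !is_const, drop = FALSE],
    # Remember the single observed value of each constant column.
    constants = lapply(df[, is_const, drop = FALSE], function(x) unique(x[!is.na(x)]))
  )
}

# Hypothetical helper: put the constant columns back, fully filled in.
restore_constant <- function(df, constants, col_order = NULL) {
  for (v in names(constants)) df[[v]] <- constants[[v]]
  if (!is.null(col_order)) df <- df[, col_order, drop = FALSE]
  df
}

test <- data.frame(
  var1 = c(2.3, 2.3, 2.3, 2.3, 2.3, NA, 2.3),
  var2 = c(5.3, 5.6, 5.9, 6.4, 4.5, 5.1, NA),
  var3 = c(NA, 1:6)
)

parts <- split_constant(test)

# The imputation of the remaining columns would go here, for example:
# imputed <- mice::complete(mice::mice(parts$varying, printFlag = FALSE))
imputed <- parts$varying # placeholder so the sketch runs without mice

full <- restore_constant(imputed, parts$constants, names(test))
print(full)
```

In this sketch the constant column never enters the imputation model; its NA is filled only when the column is restored at the end.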

Paul de Barros