I am trying to perform multiple imputation on a dataset with cross-classified nesting (i.e., data are nested within two different grouping variables that are not nested within each other; group1 and group2 in the code below). To account for the multilevel structure, I am using method="2l.pmm" in the miceadds package. However, when I run the code, I get an error message:
"Error in z0 * u[index_clus, 1:NR, drop = FALSE] : non-conformable arrays"
I'd like some ideas on why I am getting this message.
The complete dataset that I am trying to impute has 279 rows and 188 columns. Because I used a planned-missingness design in my study, about 120 of these columns need imputing. To narrow down the cause of the error, I reduced the dataset to the two grouping variables and two other variables. The error appears only once the second grouping variable (group2) is added as a predictor.
library(mice)
library(miceadds)
#Partial dataset
dat <- data.frame(
group1 = #Grouping variable 1
c(44,17,12,3,19,28,28,16,47,33,28,42,50,38,22,33,15,44,33,28,7,47,38,16,49,23,11,17,28,50,49,17,38,31,28,49,17,22,26,11,45,10,26,7,60,7,17,37,44,16),
group2 = #Grouping variable 2
c(34,26,40,6,40,9,11,40,36,36,9,39,36,36,36,36,36,36,36,7,3,40,4,40,7,36,36,26,11,36,7,40,36,11,9,7,26,40,36,36,36,31,36,19,31,19,36,7,36,11),
VarA = #Response to a survey item
c(3,2,3,NA,NA,5,4,4,NA,2,3,NA,NA,4,4,3,NA,3,3,NA,2,NA,NA,2,NA,4,NA,NA,5,NA,NA,4,4,NA,4,5,2,4,1,NA,2,3,NA,NA,4,3,NA,2,2,NA),
VarB = #Response to another survey item
c(4,NA,NA,3,2,3,1,NA,3,2,NA,5,2,NA,2,NA,1,2,NA,2,NA,NA,NA,3,4,NA,2,NA,NA,NA,2,3,4,4,4,4,2,NA,NA,2,1,3,2,2,3,3,NA,NA,1,4))
#Imputation method
medat <- make.method(dat)
medat[c("group1","group2","VarA","VarB")] <- c("","","2l.pmm","2l.pmm")
#Predictor matrix
mepred <- make.predictorMatrix(dat)
mepred[c("VarA","VarB"),c("group1","group2")] <- -2
mepred[c("group1","group2"),] <- 0
#Imputation
impdat <- mice(dat,method=medat,predictorMatrix=mepred,m=5,maxit=5,seed=5000)
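In case it helps with diagnosis, here is a small base-R sketch of what the predictor matrix looks like after the steps above. It rebuilds the matrix by hand, assuming the make.predictorMatrix() default of 1 everywhere and 0 on the diagonal; `vars`, `pm`, and `n_cluster_cols` are names introduced only for this check. In mice, the code -2 flags a cluster identifier, so the last line counts how many cluster flags each imputation model receives. I am not sure whether 2l.pmm is meant to accept more than one.

```r
# Rebuild the predictor matrix without loading mice
# (assumption: make.predictorMatrix() defaults to 1s with a 0 diagonal)
vars <- c("group1", "group2", "VarA", "VarB")
pm <- matrix(1, nrow = 4, ncol = 4, dimnames = list(vars, vars))
diag(pm) <- 0

# Same modifications as in the code above
pm[c("VarA", "VarB"), c("group1", "group2")] <- -2  # flag both as cluster IDs
pm[c("group1", "group2"), ] <- 0                    # grouping variables are not imputed

# Number of columns coded -2 (cluster identifier) in each row
n_cluster_cols <- rowSums(pm == -2)

print(pm)
print(n_cluster_cols)
```

The rows for VarA and VarB are the only ones that matter here, since group1 and group2 have an empty imputation method.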
I suspected that the error might be due to some groups having a sample size of only 1. However, the code runs when the only variables included are group1, VarA, and VarB, and some levels of group1 have a sample size of 1, so this does not seem to be the issue.
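To check the cluster sizes directly, here is a minimal base-R sketch. The grouping vectors are copied from the example data above; `size1`, `size2`, and `cross` are names introduced only for this check. It tabulates how many rows fall in each level of the two grouping variables and cross-tabulates them to confirm that neither is nested within the other.

```r
# Grouping variables copied from the example data
group1 <- c(44,17,12,3,19,28,28,16,47,33,28,42,50,38,22,33,15,44,33,28,7,47,38,16,49,
            23,11,17,28,50,49,17,38,31,28,49,17,22,26,11,45,10,26,7,60,7,17,37,44,16)
group2 <- c(34,26,40,6,40,9,11,40,36,36,9,39,36,36,36,36,36,36,36,7,3,40,4,40,7,
            36,36,26,11,36,7,40,36,11,9,7,26,40,36,36,36,31,36,19,31,19,36,7,36,11)

# Rows per level of each grouping variable
size1 <- table(group1)
size2 <- table(group2)

# How many levels contain a single row?
cat("group1 levels:", length(size1), " singletons:", sum(size1 == 1), "\n")
cat("group2 levels:", length(size2), " singletons:", sum(size2 == 1), "\n")

# Cross-classification: a group1 level that appears under more than one
# group2 level (or vice versa) means the two factors are crossed, not nested
cross <- table(group1, group2)
cat("group1 levels spanning several group2 levels:", sum(rowSums(cross > 0) > 1), "\n")
cat("group2 levels spanning several group1 levels:", sum(colSums(cross > 0) > 1), "\n")
```

Both grouping variables contain singleton levels in this subset, which is why cluster size alone does not seem to explain why only the model with group2 fails.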
What are some potential problems with the code or the dataset that may be causing the error? Any input would be appreciated!