I have a fitted cv.glmnet object that I want to use to predict on new data, but I run into a problem when creating the model matrix for that new data. I need to block bootstrap the test data and predict the response for every bootstrap sample. In some samples, one or more of the categorical variables have only one level, and building the model matrix then fails with an error. Here is an example:
library(splines)
library(caret)
library(glmnet)
data(iris)
Inx <- sample(nrow(iris),100)
iris$Species <- factor(iris$Species)
train_data <- iris[Inx, ]
test_data <- iris[-Inx,]
Formula <- "Sepal.Length ~ Sepal.Width + Petal.Length + Species:Petal.Width + Sepal.Width:Petal.Length + Species + bs(Petal.Width, df = 2, degree = 2)"
ModelMatrix <- predict(caret::dummyVars(Formula, train_data, fullRank = TRUE, sep = ""), train_data)
y <- train_data[, "Sepal.Length"]
cvglm <- cv.glmnet(x = ModelMatrix, y = y, nfolds = 4,
                   keep = TRUE, alpha = 1, parallel = FALSE, type.measure = "mse")
# Simulate a bootstrap sample in which Species takes only one value
test_data$Species <- "virginica"
ModelMatrix_test <- predict(caret::dummyVars(Formula, test_data, fullRank = TRUE, sep = ""), test_data)
Then I get this error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
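As far as I can tell, the error does not depend on glmnet or on the spline term. The sketch below is a minimal base R reproduction, assuming that caret::dummyVars ends up building its matrix the same way model.matrix does (I have not verified this in the caret source). It also shows one possible workaround that I am not sure is the right one: re-declaring the factor with the levels seen in training before building the matrix. The names train_levels, boot, err and mm are only for illustration.

```r
data(iris)

# Levels as they would be seen in the training data
train_levels <- levels(iris$Species)

# Hypothetical bootstrap sample that contains only one species
boot <- iris[iris$Species == "virginica", ]
boot$Species <- droplevels(boot$Species)  # factor now has a single level

# Building a model matrix from a one-level factor fails;
# capture the error message instead of stopping
err <- tryCatch(
  model.matrix(Sepal.Length ~ Species, data = boot),
  error = function(e) conditionMessage(e)
)
print(err)

# Possible workaround (an assumption, not a confirmed fix):
# restore the full set of training levels before building the matrix
boot$Species <- factor(as.character(boot$Species), levels = train_levels)
mm <- model.matrix(Sepal.Length ~ Species, data = boot)
print(colnames(mm))
```

With the levels restored, the matrix has one column per non-reference level even though only one level is observed, so its columns should line up with the training matrix. Whether the same idea carries over to dummyVars, for example by fitting the dummyVars object once on train_data and reusing it for every bootstrap sample, is exactly what I am unsure about.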
Any suggestions to solve the problem would be appreciated.