I am getting an odd error message when I use caret to train a glmnet model:
Error in `[.data.frame`(data, , lvls[1]) : undefined columns selected
I have used essentially the same code and the same predictors for an ordinal model (just with a different factor y), and it worked fine. That model took 400 core hours to compute, so I can't show it here.
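# Load caret (createDataPartition, trainControl, train) and dplyr (%>%, select)
library(caret)
library(dplyr)
# Source a small subset of the data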
source("https://gist.githubusercontent.com/FredrikKarlssonSpeech/ebd9fccf1de6789a3f529cafc496a90c/raw/efc130e41c7d01d972d1c69e59bf8f5f5fea58fa/voice.R")
trainIndex <- createDataPartition(notna$RC, p = .75,
                                  list = FALSE,
                                  times = 1)
training <- notna[trainIndex[, 1], ] %>%
  select(RC, FCoM_envel:ATrPS_freq, `Jitter->F0_abs_dif`:RPDE)
testing <- notna[-trainIndex[, 1], ] %>%
  select(RC, FCoM_envel:ATrPS_freq, `Jitter->F0_abs_dif`:RPDE)
## 10-fold CV
fitControl <- trainControl(method = "CV",
                           number = 10,
                           allowParallel = TRUE,
                           savePredictions = "final",
                           summaryFunction = twoClassSummary)
vtCVFit <- train(x = training[-1], y = training[, "RC"],
                 method = "glmnet",
                 trControl = fitControl,
                 preProcess = c("center", "scale"),
                 metric = "Kappa")
I can't find anything obviously wrong with the data. There are no NAs:
table(is.na(training))
FALSE
43166
I also don't see why it would try to index beyond the available columns.
Any suggestions?