I am stacking models in R with caretEnsemble as follows:
ctrl <- trainControl(
  method = "repeatedcv",
  number = 5,
  repeats = 3,
  returnResamp = "final",
  savePredictions = "final",
  classProbs = TRUE,
  selectionFunction = "oneSE",
  verboseIter = TRUE
)
models_stack <- caretStack(
  model_list,
  data = train_data,
  tuneLength = 10,
  method = "glmnet",
  metric = "ROC",
  trControl = ctrl
)
1) Why am I seeing the following error? What can I do? I am stuck now.
Timing stopped at: 0.89 0.005 0.91
Error in (function (x, y, family = c("gaussian", "binomial", "poisson", : unused argument (data = list(c(-0.00891097103286995, 0.455282701499392, 0.278236211515583, 0.532932725880776, 0.511036607368827, 0.688757947257125, -0.560727863490874, -0.21768155316146, 0.642219917023467, 0.220363129901216, 0.591732278371339, 1.02850020403572, -1.02417799431585, 0.806359545011601, -1.21490317454699, -0.671361009441299, 0.927344615788642, -0.10449847318776, 0.595493217624868, -1.05586363903119, -0.138457794869817, -1.026253562838, -1.38264471633224, -1.32900800143341, 0.0383617314263342, -0.82222313323842, -0.644251885665736, -0.174126438952992, 0.323934240274895, -0.124613523895458, 0.299359713721601, -0.723599218327519, -0.156528054435544, -0.76193093842169, 0.863217455799044, -1.01340448660914, -0.314365383747751, 1.19150804114605, 0.314703439577839, 1.55580594654149, -0.582911462615421, -0.515291378382375, 0.305142268138296, 0.513989405541095, -1.85093305614114, 0.436468060668601, -2.18997828727424, 1.12838871469007, -1.17619542016998, -0.218175589380355
2) Is caretStack not supposed to take a "data" parameter? If I need to use a different dataset for my level-1 supervisor model, what can I do?
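For context on question 2, here is a minimal sketch of a stack fitted without a `data=` argument. It rests on an assumption that should be checked against `?caretStack` for the installed version: in the caretEnsemble releases I am aware of (before 4.0), `caretStack(all.models, ...)` builds the level-1 training set from the out-of-fold predictions saved in the model list and forwards `...` to `caret::train`, so an extra `data=` ends up being passed to the glmnet fitting function, which would match the "unused argument" in the traceback. The toy data from `twoClassSim`, the names `toy`, `base_ctrl`, `toy_models`, `toy_stack`, and the `glm`/`rpart` base learners are placeholders, not the setup from this question.

```r
library(caret)
library(caretEnsemble)

set.seed(42)
# Toy binary data standing in for train_data (hypothetical, not the real dataset).
toy <- twoClassSim(200)

# Resampling for the level-0 models; predictions must be saved for stacking.
base_ctrl <- trainControl(
  method = "cv",
  number = 5,
  savePredictions = "final",
  classProbs = TRUE
)

# Level-0 models; "glm" and "rpart" are placeholders for xgbTree and rf.
toy_models <- caretList(
  Class ~ .,
  data = toy,
  trControl = base_ctrl,
  methodList = c("glm", "rpart")
)

# Level-1 model: no data= argument is supplied here. The stack is trained
# on the resampled predictions already stored in toy_models.
toy_stack <- caretStack(
  toy_models,
  method = "glm",
  trControl = trainControl(method = "cv", number = 5, classProbs = TRUE)
)
```

If the level-1 model really has to be trained on different rows, newer caretEnsemble releases appear to offer separate arguments for that purpose; I have not verified this against the version used here, so treat it as something to confirm in the package documentation.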
3) I also wanted to use AUC/ROC as the metric, but got these messages:
The metric "AUC" was not in the result set. Accuracy will be used instead.
and
The metric "ROC" was not in the result set. Accuracy will be used instead.
I have seen online examples where ROC is used. Is it unavailable for this model? Which metrics besides Accuracy can I use for this model? If I need ROC, what are my options?
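For reference on the metric messages, a minimal sketch of a control object that makes "ROC" available: in caret, the default summary function reports only Accuracy and Kappa, while `twoClassSummary` reports ROC (area under the ROC curve), Sens and Spec, and needs `classProbs = TRUE` plus outcome factor levels that are valid R names. The name `ctrl_roc` is hypothetical, and whether this alone resolves the messages in this particular stack is an assumption.

```r
library(caret)

# Same resampling settings as ctrl in the question, plus a summary function
# that computes ROC, Sens and Spec for a two-class outcome.
ctrl_roc <- trainControl(
  method = "repeatedcv",
  number = 5,
  repeats = 3,
  returnResamp = "final",
  savePredictions = "final",
  classProbs = TRUE,                  # required by twoClassSummary
  summaryFunction = twoClassSummary,  # adds "ROC" to the result set
  selectionFunction = "oneSE"
)
```

With such a control object, `metric = "ROC"` refers to a column that the summary function actually produces. A metric named "AUC" comes from a different summary function, `prSummary`, where it denotes the area under the precision-recall curve.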
As requested by @RLave, this is how my model_list is built:
grid.xgboost <- expand.grid(
  .nrounds = c(40, 50, 60),
  .eta = c(0.2, 0.3, 0.4),
  .gamma = c(0, 1),
  .max_depth = c(2, 3, 4),
  .colsample_bytree = c(0.8),
  .subsample = c(1),
  .min_child_weight = c(1)
)
grid.rf <- expand.grid(.mtry = 3:6)
model_list <- caretList(
  y ~ .,
  data = train_data_0,
  trControl = ctrl,
  tuneList = list(
    xgbTree = caretModelSpec(method = "xgbTree", tuneGrid = grid.xgboost),
    rf = caretModelSpec(method = "rf", tuneGrid = grid.rf)
  )
)
train_data_0 and train_data both come from the same dataset. All predictors are numeric, and the label is binary.