Is model_fit$results$ROC a vector (with size equal to the size of my tuning parameter lambda) of the mean of the performance measure across resampling?
It is; to be precise, its length will be equal to the number of rows of your tuneGrid, which here happens to coincide with the length of your lambdaSeq (since the only other parameter, alpha, is held constant).
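A minimal sketch of this counting, assuming (as in your question) a glmnet model with a fixed alpha and a user-supplied lambdaSeq; the values and the glmnetGrid name below are hypothetical:

lambdaSeq  <- 10^seq(-4, 0, length.out = 20)              # hypothetical lambda sequence
glmnetGrid <- expand.grid(alpha = 1, lambda = lambdaSeq)  # alpha held constant
nrow(glmnetGrid)
# [1] 20
# model_fit$results$ROC will have length nrow(glmnetGrid), i.e. 20 here
# (one mean ROC per lambda value)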
Here is a full, reproducible example, adapted from the caret docs (it uses gbm and the Accuracy metric, but the idea is the same):
library(caret)
library(mlbench)
data(Sonar)

set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining, ]
testing  <- Sonar[-inTraining, ]

fitControl <- trainControl(method = "cv",
                           number = 5)

set.seed(825)
gbmGrid <- expand.grid(interaction.depth = 3,
                       n.trees = (1:3)*50,
                       shrinkage = 0.1,
                       n.minobsinnode = 20)

gbmFit1 <- train(Class ~ ., data = training,
                 method = "gbm",
                 trControl = fitControl,
                 tuneGrid = gbmGrid,
                 ## This last option is actually one
                 ## for gbm() that passes through
                 verbose = FALSE)
Here, gbmGrid has 3 rows, i.e. it consists of only three (3) different values of n.trees with the other parameters held constant; hence, the corresponding gbmFit1$results$Accuracy will be a vector of length 3:
gbmGrid
#   interaction.depth n.trees shrinkage n.minobsinnode
# 1                 3      50       0.1             20
# 2                 3     100       0.1             20
# 3                 3     150       0.1             20

gbmFit1$results
#   shrinkage interaction.depth n.minobsinnode n.trees  Accuracy     Kappa AccuracySD   KappaSD
# 1       0.1                 3             20      50 0.7450672 0.4862194 0.05960941 0.1160537
# 2       0.1                 3             20     100 0.7829704 0.5623801 0.05364031 0.1085451
# 3       0.1                 3             20     150 0.7765188 0.5498957 0.05263735 0.1061387

gbmFit1$results$Accuracy
# [1] 0.7450672 0.7829704 0.7765188
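You can check the correspondence between the grid and the results directly:

length(gbmFit1$results$Accuracy) == nrow(gbmGrid)
# [1] TRUE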
Each of the 3 Accuracy values returned comes from the validation folds of the 5-fold cross-validation we have used as a resampling technique; more precisely, it is the mean of the validation accuracies computed in these 5 folds (and you can see that there is also an AccuracySD column, containing the corresponding standard deviation).
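If you want to see the individual fold values behind one of these means, they are kept in the resample element; by default (returnResamp = "final" in trainControl), this holds the per-fold results for the selected tuning parameters only. A minimal sketch:

gbmFit1$resample                  # one row per CV fold, for the best tuning parameters
mean(gbmFit1$resample$Accuracy)   # equals the Accuracy entry in gbmFit1$results
                                  # for the row matching gbmFit1$bestTune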
And NOT the performance measure computed over the whole sample after re-estimating the model over the whole sample for each value of lambda?
Correct, it is not that; the reported values come only from the resampling procedure (here, the cross-validation folds), not from re-estimating the model on the whole sample for each value of lambda.
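For contrast, the in-sample ("apparent") performance of the final model, which caret does refit on the whole training set but which is not what results reports, would look something like this:

confusionMatrix(predict(gbmFit1, newdata = training), training$Class)
# accuracy of the final model on the data it was fit on; typically more
# optimistic than the cross-validated Accuracy shown in gbmFit1$results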