The documentation for the glm() function states, regarding a factor response variable, that the first level denotes failure and all others success. I assume caret's train() function calls glm() under the hood when using method = 'glm', and therefore the same applies.
So in order to produce an interpretable model that is consistent with other models (i.e. the coefficients correspond to a success event), I must follow this convention.
The problem is that, even though glm() (and thus caret's train() function) treats the second factor level as success, caret's resamples() function (and the $resample variable) still treats the first level as success / positive. Sensitivity and specificity are therefore the opposite of what they should be if I want to use resamples() to compare against other models.
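To make the flip concrete, here is a minimal base-R sketch on made-up data. The toy vectors and the sens_spec() helper are hypothetical and not part of caret; the sketch only shows that changing which level counts as positive swaps the two metrics.

```r
# Toy data (made up for illustration): observed and predicted classes.
obs  <- factor(c("No", "No", "No", "No", "Yes", "Yes"), levels = c("No", "Yes"))
pred <- factor(c("No", "No", "No", "Yes", "Yes", "No"), levels = c("No", "Yes"))

# Hypothetical helper: sensitivity and specificity for a chosen positive class.
sens_spec <- function(pred, obs, positive) {
  is_pos <- obs == positive
  c(Sens = mean(pred[is_pos] == positive),   # positives predicted as positive
    Spec = mean(pred[!is_pos] != positive))  # negatives predicted as negative
}

# First level ("No") as positive, then second level ("Yes") as positive.
first_as_positive  <- sens_spec(pred, obs, positive = levels(obs)[1])
second_as_positive <- sens_spec(pred, obs, positive = levels(obs)[2])

print(first_as_positive)
print(second_as_positive)
```

With the first level as positive, Sens describes how well 'No' is detected; with the second level as positive, the same number is reported as Spec instead.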
Example:
install.packages('ISLR')
library('ISLR')
library('caret') # needed for trainControl(), train() and confusionMatrix()
summary(Default)
levels(Default$default) # 'Yes' is the second level of the factor
glm_model <- glm(default ~ ., family = "binomial", data = Default)
summary(glm_model)
train_control <- trainControl(
  summaryFunction = twoClassSummary,
  classProbs = TRUE,
  method = 'repeatedcv',
  number = 5,
  repeats = 5,
  verboseIter = FALSE,
  savePredictions = TRUE)
set.seed(123)
caret_model <- train(
  default ~ .,
  data = Default,
  method = 'glm',
  metric = 'ROC',
  preProc = c('nzv', 'center', 'scale', 'knnImpute'),
  trControl = train_control)
summary(caret_model)
caret_model # shows Sens of ~0.99 and Spec of ~0.32
# $resample shows the same, but for each fold/repeat. By now the resampled metrics are already the
# opposite of what they should be, which will propagate to resamples(). No way to specify the positive/success class in train()?
caret_model$resample
# Once I set 'Yes' as the positive class, the true sensitivity and specificity are calculated,
# but is there no way to do this for resamples()?
confusionMatrix(data = predict(caret_model, Default), reference = Default$default, positive = 'Yes')
I can see the correct sens/spec in confusionMatrix() with positive = 'Yes', but what is the solution for resamples(), so that I can accurately compare it against other models?