I'm using BayesSearchCV from scikit-optimize to train a model on a fairly imbalanced dataset. From what I've read, precision or ROC AUC would be good metrics for an imbalanced dataset. In my code:
from skopt import BayesSearchCV

knn_b = BayesSearchCV(estimator=pipe, search_spaces=search_space, n_iter=40, random_state=7, scoring='roc_auc')
knn_b.fit(X_train, y_train)
The number of iterations is just an arbitrary value I chose (although I get a warning saying I have already reached the best result, and as far as I'm aware there is no way to stop early?). For the scoring parameter I specified roc_auc, which I'm assuming will be the primary metric used to pick the best parameters from the results. So when I call knn_b.best_params_, I should get the parameter combination for which the roc_auc score is highest. Is that correct?
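A minimal sketch of the check I have in mind (assuming the search has already been fit; best_index_ is just the row of cv_results_ that best_params_ came from):

import numpy as np

best_idx = knn_b.best_index_
print(knn_b.best_params_)
# I expect best_score_ to be the mean CV roc_auc of that candidate
print(np.isclose(knn_b.cv_results_['mean_test_score'][best_idx], knn_b.best_score_))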
My confusion comes when I look at the results in knn_b.cv_results_. Shouldn't mean_test_score be the roc_auc score, because of the scoring param I passed to BayesSearchCV? What I'm doing is plotting the results to see how each combination of params performed.
import pandas as pd
import seaborn as sns

results = pd.DataFrame(knn_b.cv_results_)
sns.relplot(
    data=results, kind='line', x='param_classifier__n_neighbors', y='mean_test_score',
    hue='param_scaler', col='param_classifier__p',
)
When I try to use the roc_auc_score() function on the true and predicted values, I get something completely different.
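Roughly what I'm computing by hand (a sketch; X_test/y_test are my own held-out split, and I'm assuming the pipeline's classifier exposes predict_proba):

from sklearn.metrics import roc_auc_score

# using hard class labels
print(roc_auc_score(y_test, knn_b.predict(X_test)))
# using probabilities of the positive class, which ROC AUC is normally computed from
print(roc_auc_score(y_test, knn_b.predict_proba(X_test)[:, 1]))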
Is the mean_test_score here something different? How would I get the individual/mean roc_auc score for each CV split of each iteration? The same question applies when I use RandomizedSearchCV or GridSearchCV.
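For reference, this is the kind of per-split breakdown I'm after (a sketch; it relies on the split<N>_test_score keys that cv_results_ exposes):

import pandas as pd

res = pd.DataFrame(knn_b.cv_results_)
split_cols = [c for c in res.columns if c.startswith('split') and c.endswith('_test_score')]
# per-fold scores for every candidate, alongside the mean that gets reported
print(res[split_cols + ['mean_test_score']])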
EDIT: tldr; I want to know exactly what is being computed in mean_test_score. I thought it was roc_auc because of the scoring param, or perhaps accuracy, but it seems to be neither.