
I am using XGBClassifier for the Rain in Australia dataset and trying to predict whether it will rain today or not. I wanted to tune the hyperparameters of the classifier with GridSearch and score it with ROC_AUC. Here is my code:

param_grid = {
    "max_depth": [3, 4, 5, 7],
    "gamma": [0, 0.25, 1],
    "reg_lambda": [0, 1, 10],
    "scale_pos_weight": [1, 3, 5],
    "subsample": [0.8],  # Fix subsample
    "colsample_bytree": [0.5],  # Fix colsample_bytree
}

import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# Init the classifier
xgb_cl = xgb.XGBClassifier(objective="binary:logistic", verbosity=0)

# Init the estimator
grid_cv = GridSearchCV(xgb_cl, param_grid, scoring="roc_auc", n_jobs=-1)

# Fit
_ = grid_cv.fit(X, y)

When the search finally finishes, I get the best score with .best_score_, but somehow it looks like an accuracy score instead of ROC_AUC. I thought this was only the case with GridSearch, so I tried HalvingGridSearchCV and cross_val_score with scoring set to roc_auc, but I got an accuracy score for them too. I checked this by manually computing ROC_AUC with sklearn.metrics.roc_auc_score.

Is there anything I am doing wrong or what is the reason for this behavior?

Bex T.
  • Out of curiosity, why are you storing the fitted search in `_`? – Arturo Sbr Apr 08 '21 at 18:46
  • Otherwise, the cell will print out the whole classifier with its params, which clutters the notebook – Bex T. Apr 09 '21 at 05:29
  • How do you know that `best_score_` is giving accuracy? (I suspect your manual check is in error...) – Ben Reiniger Apr 09 '21 at 20:52
  • I used `roc_auc_score` from `sklearn.metrics`. It raised an error first because the target was not encoded so I first encoded it. Then, I obtained predictions from the best estimator from GridSearch and passed encoded targets and predictions to `roc_auc_score`. Does that seem to be correct? – Bex T. Apr 10 '21 at 05:15

1 Answer


Have you tried your own roc_auc scoring rule? It seems like you are passing labels instead of the probabilities that roc_auc actually needs.

The problem is described here: Different result roc_auc_score and plot_roc_curve

Solutions for writing your own scorer: Grid-Search finding Parameters for AUC
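
If you want to be explicit about scoring on probabilities, here is a minimal sketch with make_scorer; the scorer variable name proba_auc is my own illustration, and it reuses xgb_cl and param_grid from the question:

from sklearn.metrics import make_scorer, roc_auc_score
from sklearn.model_selection import GridSearchCV

# Scorer that asks the estimator for predict_proba and feeds the
# positive-class probabilities (not hard labels) to roc_auc_score
proba_auc = make_scorer(roc_auc_score, needs_proba=True)

grid_cv = GridSearchCV(xgb_cl, param_grid, scoring=proba_auc, n_jobs=-1)
grid_cv.fit(X, y)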

Update 2

Sorry, I saw today that my introductory text from the notebook was missing.

When calculating roc_auc_score you have the option (it doesn't matter whether it is with or without grid search, with or without a pipeline) of passing it either labels like (0/1) or probabilities like (0.995, 0.6655). The labels should be easy to get if you just convert your probabilities to labels. However, that results in a plot shaped like a straight reversed L, which sometimes looks ugly. The other option is to pass the predicted probabilities to roc_auc_score, which results in a staircase-shaped reversed L that looks much better. So what you should test first is whether you can get a ROC AUC score with labels, with and without the grid. If that is the case, you should then try to get probabilities. And there, I believe, you have to write your own scoring method, as the roc_auc scorer in the grid only serves labels, which results in high roc_auc scores. I wrote something for you, so you can see the label approach:

import xgboost as xgb
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split  # needed for the split below

cancer = load_breast_cancer()

X = cancer.data
y = cancer.target

xgb_model = xgb.XGBClassifier(objective="binary:logistic", 
                              eval_metric="auc", 
                              use_label_encoder=False,
                              colsample_bytree = 0.3, 
                              learning_rate = 0.1,
                              max_depth = 5, 
                              gamma = 10, 
                              n_estimators = 10,
                              verbosity=None)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42) 
xgb_model.fit(X_train, y_train)
# Hard 0/1 label predictions (not probabilities)
preds = xgb_model.predict(X_test)

print(confusion_matrix(preds, y_test))
print('ROC AUC Score', roc_auc_score(y_test, preds))

Gives:

[[51  2]  
[ 3 87]] 
ROC AUC Score 0.9609862671660424

Here you can see it is ridiculously high.
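
For contrast, a rough sketch of the probability-based score on the same model, assuming the scikit-learn wrapper's predict_proba is available (the exact number will depend on the split and the xgboost version):

# Probability-based AUC: use the predicted probability of the positive class
# instead of the hard 0/1 predictions
proba_preds = xgb_model.predict_proba(X_test)[:, 1]
print('ROC AUC Score (probabilities)', roc_auc_score(y_test, proba_preds))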

If you want to do it with the grid search, get rid of this:

# Fit
_ = grid_cv.fit(X, y)

Just call grid_cv.fit(X, y). fit is a method applied to grid_cv, and the results are stored within grid_cv.

print(grid_cv.best_score_) should deliver the AUC as you have defined it. See also: different roc_auc with XGBoost gridsearch scoring='roc_auc' and roc_auc_score? But this should also be ridiculously high, as you will probably be passing labels instead of probabilities.

Beware also of: What is the difference between cross_val_score with scoring='roc_auc' and roc_auc_score?

And nothing prevents you from applying the roc_auc_score function to your grid results...
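
For completeness, a hedged sketch of doing exactly that; the hold-out split and variable names are my own illustration (the question fits on all of X, y) and it assumes the target is already encoded as 0/1:

from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hold out part of the data purely to compare the two flavours of AUC
X_train, X_holdout, y_train, y_holdout = train_test_split(X, y, random_state=42)
grid_cv.fit(X_train, y_train)

best = grid_cv.best_estimator_
label_auc = roc_auc_score(y_holdout, best.predict(X_holdout))               # AUC from hard labels
proba_auc = roc_auc_score(y_holdout, best.predict_proba(X_holdout)[:, 1])   # AUC from probabilities
print('label AUC:', label_auc, 'probability AUC:', proba_auc)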

Patrick Bormann
  • But using `roc_auc_score` outside the grid search or CV is working without any errors. I am sorry but I am new to sklearn and I didn't completely understand your answer – Bex T. Apr 09 '21 at 15:20
  • If you use roc_auc, your results are optimized for it (because it's a grid search). You can then just apply the roc_auc function to the grid results if you like, but it should also give you the appropriate ROC score from the grid. Have you tested it with a confusion matrix and calculated your own roc_auc_score from the grid's confusion matrix, so that you can be sure grid_cv.best_score_ equals the ROC AUC? – Patrick Bormann Apr 09 '21 at 19:52
  • If the roc_auc differs too strongly, maybe you have to insert probabilities instead of labels, as I already said. – Patrick Bormann Apr 09 '21 at 19:58
  • So, I should get the predictions using `predict_proba` instead of just `predict`? – Bex T. Apr 10 '21 at 05:18
  • Hey, yes, you should try predict_proba, but I'm unsure if it is possible with xgboost; if not, you have to write your own. I updated my answer, as my introductory text from my notebook was somehow missing. I wrote you the example for labels with roc_auc; that should always be possible, with or without grid. But when you want to do probas, I believe you have to write it yourself, as I believe xgboost does not deliver predict_proba. But overall yes, I believe you should write your own scoring with "predict_proba" – Patrick Bormann Apr 10 '21 at 08:22