I'm working on a binary classification problem with ~30 features of enzyme substrates, predicting two targets (EC1 and EC2). I'm using XGBoost with Optuna for hyperparameter tuning. However, I'm observing a discrepancy between the AUC ROC values reported by Optuna and those computed with scikit-learn.
The output from optuna:
AUC ROC score 1: 0.7109184689577985
AUC ROC score 2: 0.6030927230046949
But the AUC ROC scores computed with sklearn for the best parameters found by Optuna are:
AUC ROC score 1: 0.7065598459411416
AUC ROC score 2: 0.5656470070422535
Here is the code:
import xgboost as xgb
import optuna
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
import numpy as np
# Setting a fixed random seed for reproducibility
np.random.seed(42)
def train_model(x_train, y_train, x_eval, y_eval):
    def objective(trial):
        param = {
            'objective': 'binary:logistic',
            'eval_metric': 'auc',
            'n_estimators': trial.suggest_int('n_estimators', 100, 1000),
            'max_depth': trial.suggest_int('max_depth', 3, 6),
            'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.1, log=True),
            'subsample': trial.suggest_float('subsample', 0.5, 1),
            'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1),
            'reg_alpha': trial.suggest_float('reg_alpha', 0, 10),
            'reg_lambda': trial.suggest_float('reg_lambda', 0, 10),
            'gamma': trial.suggest_float('gamma', 0.01, 1, log=True),
            'random_state': 42,
            'early_stopping_rounds': 10
        }
        model = xgb.XGBClassifier(**param)
        model.fit(x_train, y_train, eval_set=[(x_eval, y_eval)], verbose=False)
        y_pred = model.predict_proba(x_eval)[:, 1]
        auc_roc = roc_auc_score(y_eval, y_pred)
        return auc_roc

    study = optuna.create_study(direction='maximize')
    study.optimize(objective, n_trials=100)
    return study.best_trial.params, study.best_trial.value
# Splitting the data into train and evaluation sets
# (x_train is the full feature matrix, y_train the two-column EC1/EC2 label array, both loaded earlier)
x_train, x_eval, y_train, y_eval = train_test_split(x_train, y_train, test_size=0.2, random_state=42)
# For EC1
best_params_1, best_auc_1 = train_model(x_train, y_train[:, 0], x_eval, y_eval[:, 0])
classifier_1 = xgb.XGBClassifier(**best_params_1)
classifier_1.fit(x_train, y_train[:, 0])
y_pred_1 = classifier_1.predict_proba(x_eval)[:, 1]
# For EC2
best_params_2, best_auc_2 = train_model(x_train, y_train[:, 1], x_eval, y_eval[:, 1])
classifier_2 = xgb.XGBClassifier(**best_params_2)
classifier_2.fit(x_train, y_train[:, 1])
y_pred_2 = classifier_2.predict_proba(x_eval)[:, 1]
auc_score_1 = roc_auc_score(y_eval[:, 0], y_pred_1)
auc_score_2 = roc_auc_score(y_eval[:, 1], y_pred_2)
I have implemented the XGBoost models with hyperparameter tuning using Optuna. I expected the AUC ROC values reported by Optuna's best trials to match the values calculated with scikit-learn's roc_auc_score for the classifiers retrained with the best parameters. However, the actual results show a noticeable difference between these values. What is causing this discrepancy?
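For reference, the comparison I have in mind is roughly the following: rebuilding the full trial-time configuration for EC1 (the fixed settings from the objective plus the sampled ones in best_params_1), training with the same eval_set and early stopping, and scoring it the same way should, as far as I understand, reproduce the best-trial value. This is only a sketch with made-up names (check_params_1, check_model_1), reusing the split and imports from above:

# Hypothetical check: rebuild the trial-time configuration for EC1 and score it the same way as in the objective
check_params_1 = {
    'objective': 'binary:logistic',
    'eval_metric': 'auc',
    'random_state': 42,
    'early_stopping_rounds': 10,
    **best_params_1,  # only the sampled hyperparameters are returned by study.best_trial.params
}
check_model_1 = xgb.XGBClassifier(**check_params_1)
check_model_1.fit(x_train, y_train[:, 0], eval_set=[(x_eval, y_eval[:, 0])], verbose=False)
check_pred_1 = check_model_1.predict_proba(x_eval)[:, 1]
print('Retrained with trial settings:', roc_auc_score(y_eval[:, 0], check_pred_1))
print('Optuna best trial value:      ', best_auc_1)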