I have a LightGBM classifier with the following parameters:
from lightgbm import LGBMClassifier

lgbmodel_2_wt = LGBMClassifier(boosting_type='gbdt',
                               num_leaves=105,
                               max_depth=11,
                               learning_rate=0.03,
                               n_estimators=5000,
                               categorical_feature=[0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
                               objective='binary',
                               class_weight={0: 0.6, 1: 1},
                               min_split_gain=0.01,
                               min_child_weight=2,
                               min_child_samples=20,
                               subsample=0.9,
                               colsample_bytree=0.8,
                               reg_alpha=0.1,
                               reg_lambda=0.1,
                               n_jobs=-1,
                               verbose=-1)
The model is fitted with the following call:
from lightgbm import record_evaluation

history = {}
eval_history = record_evaluation(history)

lgbmodel_2_wt.fit(X_train, y_train,
                  eval_set=[(X_train, y_train), (X_test, y_test)],
                  eval_metric='auc',
                  verbose=500,
                  early_stopping_rounds=30,
                  callbacks=[eval_history])
The fit produces the following evaluation output:
Training until validation scores don't improve for 30 rounds
[500] training's auc: 0.902706 training's binary_logloss: 0.379436 valid_1's auc: 0.887315 valid_1's binary_logloss: 0.369
Early stopping, best iteration is:
[860] training's auc: 0.909587 training's binary_logloss: 0.366997 valid_1's auc: 0.88844 valid_1's binary_logloss: 0.366346
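Since the record_evaluation callback fills the history dict with the per-iteration metrics, I can also read the validation AUC back out of it. A minimal sketch, assuming the dict is laid out as dataset name -> metric name -> list of per-iteration values (matching the 'training' / 'valid_1' names in the log):

# history is populated by the record_evaluation callback during fit().
# Assumed layout: {'training': {'auc': [...], ...}, 'valid_1': {'auc': [...], ...}}
valid_auc = history['valid_1']['auc']                 # per-iteration AUC on the eval set
best_iter = valid_auc.index(max(valid_auc)) + 1
print('best valid_1 AUC:', max(valid_auc), 'at iteration', best_iter)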
Going by the training log, the best validation AUC is 0.88844. However, the result changes when I manually predict on the same test set, i.e. X_test:
from sklearn.metrics import roc_auc_score

y_pred = lgbmodel_2_wt.predict(X_test)
roc_auc_score(y_test, y_pred)
The above code gives an AUC score of 0.7901740256981424. Which AUC score should I consider correct, given that the two scores differ for the same test set? LightGBM's online documentation is limited, and I have been having a hard time interpreting these results. Any help is appreciated.
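For completeness, here is a minimal sketch of the comparison I am trying to make, assuming (I am not certain) that the eval_metric AUC in the training log is computed from the raw predicted probabilities (predict_proba) rather than from the hard 0/1 labels that predict() returns:

from sklearn.metrics import roc_auc_score

# Hard 0/1 class labels, as returned by predict() on a classifier.
y_pred_labels = lgbmodel_2_wt.predict(X_test)

# Predicted probability of the positive class (second column of predict_proba).
# Assumption: this is the score the logged eval_metric 'auc' is based on.
y_pred_proba = lgbmodel_2_wt.predict_proba(X_test)[:, 1]

print('AUC from hard labels  :', roc_auc_score(y_test, y_pred_labels))
print('AUC from probabilities:', roc_auc_score(y_test, y_pred_proba))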