I am using LightGBM for a binary classification project, with the built-in 'logloss' as the loss function. However, I want early stopping to halt training at the iteration with the highest precision-recall AUC (PR AUC), so I implemented the following custom eval function:

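from sklearn.metrics import auc, precision_recall_curve

def f_pr_auc(probas_pred, y_true):
    # y_true is the lgb.Dataset being evaluated; get_label() returns its labels
    labels = y_true.get_label()
    p, r, _ = precision_recall_curve(labels, probas_pred)
    score = auc(r, p)
    # (metric name, metric value, is_higher_better)
    return "pr_auc", score, True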

This custom eval function works well, and I get per-iteration updates like the following:

[Image: screenshot of the evaluation updates]

However, training stopped at the lowest logloss value rather than at the highest pr_auc value. Is there a way to disable the logloss evaluation and evaluate only pr_auc?

For imbalanced datasets, the highest pr_auc value may not be achieved at the lowest logloss. So I'd like to stop the iterations when the highest pr_auc is achieved.

Marco Cerliani
David293836

1 Answer

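With the LightGBM Python API, set 'metric' to 'custom' in your parameters dictionary. This disables the built-in metrics (including logloss), so only the function passed as feval is evaluated and used for early stopping: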

params = {
    ......
    'objective': 'binary',
    'metric': 'custom',
    ......
}

gbm = lgb.train(params,
                lgb_train,
                feval=f_pr_auc,
                valid_sets=lgb_eval)
Marco Cerliani