
I'm dealing with an imbalanced classification problem in which the class ratio is 0:1 = 717.26:1. I tried many models, and GBM worked best for my case.

Then I came across a research paper and an article on dealing with the imbalanced class problem:

Facing Imbalanced Data Recommendations for the Use of Performance Metrics

Handling Class Imbalance with R and Caret - Caveats when using the AUC

The paper and the article seem to say opposite things.

From the research paper:

It says: "We discovered that with exception of area under the ROC curve, all performance metrics were attenuated by imbalanced distributions; in many cases, dramatically so. Alpha and kappa measures were affected by skew in either direction; whereas F1-score was affected by skew only in one direction. While ROC was unaffected by skew, precision-recall curves suggest that ROC may mask poor performance". That means AUC PR is also affected by imbalance, as shown in the picture.

In the article, on the other hand, Dan Martin (the author) says that AUC ROC alone should not be used to select the best classifier; AUC PR should also be taken into account when dealing with imbalanced class learning.
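To make the two statements concrete, here is a minimal sketch in plain Python. It is not taken from either source: the scorer is hypothetical (positives score ~ N(1.5, 1), negatives ~ N(0, 1)), and `roc_auc`, `average_precision` and `simulate` are illustrative helper names. It compares a balanced sample with one at roughly my 717:1 ratio, using the same scorer in both cases.

```python
import random

def roc_auc(y_true, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def average_precision(y_true, scores):
    """Average precision (a common AUC PR estimate): mean precision at each positive hit.
    Ties in the scores are not specially handled here."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            hits += 1
            total += hits / rank
    return total / hits

def simulate(n_pos, n_neg, seed=0):
    """Draw scores from the hypothetical scorer and return (ROC AUC, average precision)."""
    rng = random.Random(seed)
    y = [1] * n_pos + [0] * n_neg
    s = [rng.gauss(1.5, 1.0) for _ in range(n_pos)]
    s += [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    return roc_auc(y, s), average_precision(y, s)

balanced_auc, balanced_ap = simulate(100, 100)
skewed_auc, skewed_ap = simulate(100, 71_726)  # about 717.26 negatives per positive

print(f"balanced: ROC AUC={balanced_auc:.3f}  AP={balanced_ap:.3f}")
print(f"skewed:   ROC AUC={skewed_auc:.3f}  AP={skewed_ap:.3f}")
```

In this toy setup ROC AUC should stay at a similar level in both runs, while average precision should drop sharply under the skew. That would fit both readings at once: ROC AUC itself is insensitive to the class ratio, and the PR view exposes how few of the top-ranked cases are actually positive.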

Now my question is: if I take the result from the research paper to be true, then it seems to contradict the conclusion of the article.

So can somebody tell me which one should be considered correct?

Sorry for the lengthy question.

Thanks in advance!

  • It's one of those cases where each data set is different and you may get completely different results for the same metrics. There is no best measure, but it would be wise to use proper scoring rules and not classification metrics such as accuracy, F1, and so on. – user2974951 Feb 28 '20 at 12:17
  • Yes, we can use accuracy and F1, but as far as I know, in imbalanced class learning accuracy gives misleading results, and if we target F1, then F1 is just the harmonic mean of precision and recall. So I believe I'll end up targeting AUC ROC and AUC PR. – Lokesh Arya Feb 28 '20 at 12:22
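As a rough illustration of the distinction raised in the comments, the sketch below contrasts a thresholded classification metric (F1) with two proper scoring rules (Brier score and log loss). The data and the two sets of predicted probabilities are made up for the example, and the function names are illustrative only.

```python
import math

def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)

def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-likelihood of the outcomes (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def f1_at_threshold(y_true, probs, threshold=0.5):
    """F1 = harmonic mean of precision and recall after thresholding the probabilities."""
    pred = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, d in zip(y_true, pred) if y == 1 and d == 1)
    fp = sum(1 for y, d in zip(y_true, pred) if y == 0 and d == 1)
    fn = sum(1 for y, d in zip(y_true, pred) if y == 1 and d == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up example: 8 negatives, 2 positives, two hypothetical models.
y = [0] * 8 + [1] * 2
probs_a = [0.1] * 8 + [0.6] * 2  # confident on negatives, cautious on positives
probs_b = [0.4] * 8 + [0.9] * 2  # overestimates the risk for negatives

for name, probs in [("A", probs_a), ("B", probs_b)]:
    print(name,
          "F1:", round(f1_at_threshold(y, probs), 3),
          "Brier:", round(brier_score(y, probs), 3),
          "log loss:", round(log_loss(y, probs), 3))
```

Both hypothetical models get the same F1 at a 0.5 threshold, but the scoring rules tell them apart because they evaluate the predicted probabilities themselves, not the thresholded labels.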
