
I am working on a binary classification problem and my dataset is imbalanced, with roughly a 1:7 ratio: 1000 "1" labels and 6990 "0" labels.

Predicting "1" Labels is more important than "0" but still, It should also detect "0" labels correctly as much as possible.

I used sampling techniques and tried different models (XGBClassifier, LightGBM, SVM, KNN), and I got different confusion matrices. In some of them, detection of the "1" label is very good but detection of "0" is not; in others, detection of both "1" and "0" is average.

I know accuracy is not a good metric for evaluating an imbalanced dataset, so I used recall, the F2 score, and the AUC score. But I am still confused about which model is best.
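For reference, this is a minimal sketch of how I compute those three metrics for one model on a held-out test set. It assumes scikit-learn is available; the data here is a synthetic stand-in for my real dataset, and LogisticRegression is just a placeholder model:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, fbeta_score, roc_auc_score

# Synthetic stand-in with roughly the same 1:7 imbalance as the real data.
X, y = make_classification(n_samples=7990, weights=[0.875, 0.125], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = model.predict(X_te)
y_score = model.predict_proba(X_te)[:, 1]

print("recall:", recall_score(y_te, y_pred))         # sensitivity on the "1" class
print("F2:    ", fbeta_score(y_te, y_pred, beta=2))  # weights recall 2x over precision
print("AUC:   ", roc_auc_score(y_te, y_score))       # uses predicted probabilities
```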

According to these results, which model is best?

[image: evaluation results for each model]


1 Answer


One way to validate your models is to use k-fold cross-validation. Divide your data into 4 or 5 train-test splits, evaluate each model on every test fold, and average the results. That should give you a better understanding of how the different models perform than a single confusion matrix does.
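As a rough sketch (assuming scikit-learn; the data below is synthetic and the SVM/KNN estimators are just examples, swap in your own fitted pipelines), you can do this with `StratifiedKFold` and `cross_validate`, averaging recall, F2, and AUC over 5 folds for each candidate model:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.metrics import make_scorer, fbeta_score
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# The three metrics mentioned in the question.
scoring = {
    "recall": "recall",
    "f2": make_scorer(fbeta_score, beta=2),
    "auc": "roc_auc",
}

# Synthetic stand-in for the ~1:7 imbalanced dataset.
X, y = make_classification(n_samples=7990, weights=[0.875, 0.125], random_state=0)

# Stratified folds keep the 1:7 class ratio in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "SVM": SVC(probability=True),   # probability=True so AUC can be computed
    "KNN": KNeighborsClassifier(),
    # XGBClassifier / LGBMClassifier can be added here in the same way.
}

for name, model in models.items():
    res = cross_validate(model, X, y, cv=cv, scoring=scoring)
    print(name,
          "recall=%.3f" % np.mean(res["test_recall"]),
          "F2=%.3f" % np.mean(res["test_f2"]),
          "AUC=%.3f" % np.mean(res["test_auc"]))
```

Comparing the fold-averaged scores (and their spread) is usually more reliable than comparing single confusion matrices, since one lucky split can make a model look better than it really is.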
