
I am doing classification on a dataset with three classes (Labels Low, Medium, High).

I run the following code to get my confusion matrix:

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

And I get the following output for cm:

array([[18, 10],
       [ 7, 61]], dtype=int64)

What does this output mean? I read the linked article "Confusion Matrix and Class Statistics" but didn't understand it.

Sachin Yadav
  • how about [scikit docs](https://scikit-learn.org/stable/modules/model_evaluation.html#confusion-matrix)? – alex Mar 05 '20 at 10:07
  • [tag:spyder] has nothing at all to do with this, it's only an IDE, it's not going to affect how your code runs. The relevant tags are [tag:scikit-learn], [tag:classification], [tag:multiclass-classification] – smci Mar 05 '20 at 10:26
  • If you have three different classes in `y`, then the confusion matrix should be 3x3. Could you check whether all labels exist in `y_test` and `y_pred`? – Ala Tarighati Mar 05 '20 at 10:30
  • You say you have three classes, but it looks like your classifier only did two-class. Post us the code you used for training your classifier. In particular make sure you're passing the label column in correctly. – smci Mar 05 '20 at 10:31
  • You haven't posted the code for your classifier, so we can't work out why it trained on only two classes instead of three. SO rules require you to post a [Minimal, Complete, Verifiable Example (MCVE)](https://stackoverflow.com/help/minimal-reproducible-example) – smci Mar 06 '20 at 01:44
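
As the comments suggest, a three-class problem should yield a 3x3 matrix. One way to check is to pass the expected labels explicitly via the `labels=` parameter of `confusion_matrix`; a class that never appears in `y_test` or `y_pred` then shows up as an all-zero row or column instead of silently disappearing. A minimal sketch with toy stand-ins for the asker's arrays:

from sklearn.metrics import confusion_matrix

# Toy stand-ins for y_test / y_pred: this classifier never predicts 'High'
y_test = ['Low', 'Medium', 'High', 'Low', 'Medium']
y_pred = ['Low', 'Medium', 'Low', 'Low', 'Medium']

# labels= forces all three classes into the matrix, in this fixed order
cm = confusion_matrix(y_test, y_pred, labels=['Low', 'Medium', 'High'])
print(cm)
# [[2 0 0]
#  [0 2 0]
#  [1 0 0]]   <- the 'High' column is all zeros: it was never predicted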

2 Answers


Here is an explanation of the output.

For any NxN confusion matrix produced by scikit-learn, the rows correspond to the actual (true) labels and the columns to the predicted labels: entry (i, j) counts the samples whose true class is i and whose predicted class is j.
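
A short sketch of this convention, using hypothetical three-class labels:

from sklearn.metrics import confusion_matrix

y_true = ['Low', 'Low', 'Medium', 'High', 'High', 'Medium']
y_pred = ['Low', 'Medium', 'Medium', 'High', 'Low', 'Medium']

cm = confusion_matrix(y_true, y_pred, labels=['Low', 'Medium', 'High'])
print(cm)
# [[1 1 0]    row 0: true 'Low'    -> 1 kept, 1 misclassified as 'Medium'
#  [0 2 0]    row 1: true 'Medium' -> both predicted 'Medium'
#  [1 0 1]]   row 2: true 'High'   -> 1 predicted 'Low', 1 'High'

# cm[i, j] == number of samples with true label i predicted as label j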


For the 2x2 case (with the default label ordering, negative class first), the layout is:

[[TN, FP],
 [FN, TP]]
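
A quick check of that layout, with hypothetical binary labels:

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1]

print(confusion_matrix(y_true, y_pred))
# [[2 1]     TN=2, FP=1
#  [1 3]]    FN=1, TP=3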


See also: https://scikit-learn.org/stable/modules/model_evaluation.html#confusion-matrix

seralouk

Let us take a closer look at the results:

import numpy as np

tn, fp, fn, tp = np.array([[18, 10],
                           [ 7, 61]]).ravel()

Which means:

tn (True Negative) = 18
fp (False Positive) = 10
fn (False Negative) = 7 
tp (True Positive) = 61

In other words,

  • 18 class-0 samples classified (correctly) as class-0
  • 10 class-0 samples classified (incorrectly) as class-1
  • 7 class-1 samples classified (incorrectly) as class-0
  • 61 class-1 samples classified (correctly) as class-1

You could also obtain these values directly with:

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

Note that this four-way unpacking works only when the matrix is 2x2, i.e. for a binary problem.
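
A self-contained check of that unpacking, with hypothetical binary labels:

from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 1]

# .ravel() flattens the 2x2 matrix row by row: [tn, fp, fn, tp]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 2 1 1 3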
Ala Tarighati