-1

I have a CSV file with two columns, like this:

actual,predicted
1,0
1,0
1,1
0,1
.,.
.,.

How do I read this file and plot a confusion matrix in Python? I tried the following code, taken from a program:

import pandas as pd
from sklearn.metrics import confusion_matrix
import numpy

CSVFILE='./mappings.csv'
test_df=pd.read_csv[CSVFILE]

actualValue=test_df['actual']
predictedValue=test_df['predicted']

actualValue=actualValue.values
predictedValue=predictedValue.values

cmt=confusion_matrix(actualValue,predictedValue)
print cmt

But it gives me this error:

Traceback (most recent call last):
  File "confusionMatrixCSV.py", line 7, in <module>
    test_df=pd.read_csv[CSVFILE]
TypeError: 'function' object has no attribute '__getitem__'
uNIKx

3 Answers

1

pd.read_csv is a function. You call a function in Python by using parentheses, not square brackets.

You should use pd.read_csv(CSVFILE) instead of pd.read_csv[CSVFILE].
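
To illustrate the difference, here is a minimal sketch using a hypothetical stand-in function named `load` (it is not part of pandas). Calling it with parentheses works, while subscripting it with square brackets raises a TypeError of the same kind as in the question. The exact wording of the message depends on the Python version.

```python
def load(path):
    """Hypothetical stand-in for pd.read_csv; it just echoes the path."""
    return "loaded " + path


# Parentheses call the function.
called = load("./mappings.csv")

# Square brackets try to index the function object itself, which fails.
try:
    load["./mappings.csv"]
    subscript_error = None
except TypeError as exc:
    subscript_error = exc

print(called)
print(type(subscript_error).__name__)
```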

Abhishek Sharma
  • I am now getting this error: ValueError: Classification metrics can't handle a mix of multiclass and continuous targets – uNIKx Jun 13 '18 at 12:08
  • That should be asked in a different question. – Abhishek Sharma Jun 13 '18 at 12:11
  • I corrected the [ ] to pd.read_csv(CSVFILE). After running it I got that error, which is why I posted it here. Any idea why that's happening? – uNIKx Jun 13 '18 at 12:13
  • I solved the above error. The CSV file had a few entries that did not follow the format. Every row has to contain both values, e.g. 3,4, but a few rows were missing the second value, e.g. 3, . – uNIKx Jun 13 '18 at 12:31
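
For rows like the ones described in the last comment, one possible approach is to drop incomplete rows before computing the matrix. This is only a sketch under assumptions: the inline sample data is hypothetical, and it uses `io.StringIO` in place of the real file so that it is self-contained.

```python
import io

import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical sample data; the third data row is missing its "predicted" value.
csv_text = "actual,predicted\n1,0\n1,1\n0,\n0,1\n"

df = pd.read_csv(io.StringIO(csv_text))

# Drop rows where either column is missing, then restore integer labels
# (a column that contains a missing value is read as float).
clean = df.dropna(subset=["actual", "predicted"]).astype(int)

cm = confusion_matrix(clean["actual"], clean["predicted"])
print(cm)
```

Whether dropping rows is appropriate depends on the data; fixing the source file, as the asker did, is the other option.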
1
import pandas as pd
from sklearn.metrics import confusion_matrix

CSVFILE = './mappings.csv'
test_df = pd.read_csv(CSVFILE)  # call read_csv with parentheses

# Each column already holds one class label per row, so use the values directly.
actualValue = test_df['actual'].values
predictedValue = test_df['predicted'].values

cmt = confusion_matrix(actualValue, predictedValue)
print(cmt)
Nihal
  • I corrected it, but I am still getting this error: `ValueError: Classification metrics can't handle a mix of multiclass and continuous targets` – uNIKx Jun 13 '18 at 12:16
  • The above error is solved. The CSV file had a few entries that did not follow the format. Every row has to contain both values, e.g. 3,4, but a few rows were missing the second value, e.g. 3, . – uNIKx Jun 13 '18 at 12:36
1

Here's a simple solution that computes the accuracy and prints the confusion matrix for input with the same two columns as in the question, read here from a tab-separated text file.

from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score

result = []
actual = []

# Each line holds "actual<TAB>predicted"; this loop assumes there is no header row.
with open("results.txt", "r") as file:
    for line in file:
        sent = line.split("\t")
        actual.append(int(sent[0]))
        result.append(int(sent[1]))

cnf_mat = confusion_matrix(actual, result)
print(cnf_mat)

print('Test Accuracy: %s' % accuracy_score(actual, result))
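
The snippets in this thread print the matrix as text. To draw it as a figure, as the question asks, one possible sketch uses matplotlib, which is an assumption since it is not used elsewhere in this thread. The label lists are hypothetical sample values standing in for the two columns, and the output file name is only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# Hypothetical labels standing in for the "actual" and "predicted" columns.
actual = [1, 1, 1, 0]
predicted = [0, 0, 1, 1]

cm = confusion_matrix(actual, predicted)

fig, ax = plt.subplots()
image = ax.imshow(cm, cmap="Blues")
fig.colorbar(image, ax=ax)

# Rows are actual classes, columns are predicted classes.
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
ax.set_xticks(range(cm.shape[1]))
ax.set_yticks(range(cm.shape[0]))

# Write each count into its cell.
for row in range(cm.shape[0]):
    for col in range(cm.shape[1]):
        ax.text(col, row, str(cm[row, col]), ha="center", va="center")

fig.savefig("confusion_matrix.png")
```

The default tick labels 0 and 1 match the class labels only because the sample classes are 0 and 1; other label sets would need explicit tick labels.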
uNIKx