I have trained a CNN to classify images into 5 classes. However, when I plot a one-vs-rest
ROC curve for each class, all 5 curves are almost diagonal, with an AUC of around 0.5.
I have no idea what has gone wrong.
The model should have an accuracy of around 86%.
Here is the code:
import os, shutil
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.keras import models, layers, optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import plot_confusion_matrix, accuracy_score
from sklearn.metrics import roc_curve, auc, roc_auc_score, RocCurveDisplay
from sklearn.preprocessing import label_binarize
import random
model = tf.keras.models.load_model('G:/Myxoid lesion/Myxoid_EN3_finetune4b')
model.summary()
data_dir='G:/Myxoid lesion/Test/'
batch_size = 64
img_height = 300
img_width = 300
test_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
model.compile(optimizer=optimizers.Adam(learning_rate=0.00002),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['sparse_categorical_accuracy'])
correct = np.array([], dtype='int32')
# Get the labels of test_ds
for x, y in test_ds:
    correct = np.concatenate([correct, y.numpy()])
# Get the prediction probabilities for each class for each test image
prediction_prob = tf.nn.softmax(model.predict(test_ds))
num_class = 5
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(num_class):
    fpr[i], tpr[i], _ = roc_curve(correct, prediction_prob[:, i], pos_label=i)
    roc_auc[i] = auc(fpr[i], tpr[i])
plt.figure()
lw = 2
labels = test_ds.class_names  # assumption: class names taken from the sub-directory names
for i in range(num_class):
    plt.plot(fpr[i], tpr[i],
             color=(random.random(), random.random(), random.random()),
             label='{0} (AUC = {1:0.2f})'.format(labels[i], roc_auc[i]))
plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.legend(loc="lower right")
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC analysis')
plt.show()
The "prediction_prob" variable contains:
array([[6.3877934e-09, 6.3617526e-06, 5.5736535e-07, 4.9789862e-05,
        9.9994326e-01],
       [6.5260068e-08, 8.8882577e-03, 3.9350948e-06, 9.9110776e-01,
        4.0252076e-11],
       [2.7514220e-04, 2.9315910e-05, 1.6688553e-04, 9.9952865e-01,
        3.5938730e-10],
       ...,
       [1.1131389e-09, 9.8325908e-01, 3.4283744e-06, 1.6737511e-02,
        7.3243338e-12],
       [1.4697845e-08, 4.7125661e-05, 1.4077022e-03, 6.4052530e-02,
        9.3449265e-01],
       [9.9999940e-01, 1.3071107e-07, 4.3149896e-07, 4.7902233e-08,
        9.2861301e-09]], dtype=float32)>
While the "correct" variable contains the correct label for each test image:
array([0, 1, 4, ..., 4, 2, 4])
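A quick sanity check on these two printouts is to compare the arg-max of each visible prediction row with the label in the same position. The sketch below uses only the six rows and six labels printed above (the elided middle part is not reconstructed) and assumes that row k of prediction_prob is supposed to correspond to entry k of correct. The names visible_probs and visible_labels are made up for this illustration. With a model that is about 86% accurate, most positions would be expected to agree.

```python
# First three and last three rows of prediction_prob, copied from the printout above.
visible_probs = [
    [6.3877934e-09, 6.3617526e-06, 5.5736535e-07, 4.9789862e-05, 9.9994326e-01],
    [6.5260068e-08, 8.8882577e-03, 3.9350948e-06, 9.9110776e-01, 4.0252076e-11],
    [2.7514220e-04, 2.9315910e-05, 1.6688553e-04, 9.9952865e-01, 3.5938730e-10],
    [1.1131389e-09, 9.8325908e-01, 3.4283744e-06, 1.6737511e-02, 7.3243338e-12],
    [1.4697845e-08, 4.7125661e-05, 1.4077022e-03, 6.4052530e-02, 9.3449265e-01],
    [9.9999940e-01, 1.3071107e-07, 4.3149896e-07, 4.7902233e-08, 9.2861301e-09],
]
# First three and last three entries of correct, copied from the printout above.
visible_labels = [0, 1, 4, 4, 2, 4]

# Predicted class = index of the largest probability in each row.
predicted = [row.index(max(row)) for row in visible_probs]

# Count positions where the predicted class equals the label at the same index.
matches = sum(p == y for p, y in zip(predicted, visible_labels))

print("predicted:", predicted)
print("labels:   ", visible_labels)
print(matches, "of", len(visible_labels), "visible positions agree")
```

None of the six visible positions agree. Six rows are a tiny sample, so this is only suggestive, but it is consistent with the labels and the predictions not being in the same order.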
I believe I followed the ROC example on the scikit-learn
website.
The generated tpr[i] and fpr[i] arrays are almost linearly related, so the AUC comes out at about 0.5.
Is there a problem in how I generate tpr[i] and fpr[i]? Can anyone spot the problem?
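One possible explanation, stated here as an assumption and not confirmed by anything in this post: correct and prediction_prob come from two separate passes over test_ds, and image_dataset_from_directory shuffles by default, so the two passes may not yield the images in the same order. The sketch below uses synthetic scores and only the standard library (the helper one_vs_rest_auc is made up for illustration and is not a scikit-learn function) to show what such a misalignment would do to a one-vs-rest AUC.

```python
import random


def one_vs_rest_auc(labels, scores, positive):
    """AUC for class `positive` vs the rest, using the pairwise-ranking formula."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    # Fraction of (positive, negative) pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
num_class = 5
labels = [rng.randrange(num_class) for _ in range(600)]

# Synthetic "probability of class 0": high when the true label is 0,
# low otherwise, plus some noise. This stands in for prediction_prob[:, 0].
scores = [(0.8 if y == 0 else 0.2) + rng.uniform(-0.3, 0.3) for y in labels]

# Labels and scores in the same order.
aligned_auc = one_vs_rest_auc(labels, scores, positive=0)

# Simulate a second pass over the data that returns the labels in another order.
shuffled_labels = labels[:]
rng.shuffle(shuffled_labels)
misaligned_auc = one_vs_rest_auc(shuffled_labels, scores, positive=0)

print(f"aligned AUC:    {aligned_auc:.2f}")
print(f"misaligned AUC: {misaligned_auc:.2f}")
```

With aligned labels the AUC is high, while the same scores paired with independently shuffled labels give an AUC near 0.5, which matches the diagonal curves described above. If this is what is happening, collecting the labels and the predictions in the same pass over the dataset, or loading the test set with shuffle=False, should keep them aligned.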
Thanks!