I am trying to compute a confusion matrix for my object detection model, but I keep stumbling over some pitfalls. My current approach is to compare each predicted box with each ground-truth box. If their IoU exceeds some threshold, I insert the prediction into the confusion matrix, delete that element from the predictions list, and move on to the next one.
Because I also want misclassified proposals to appear in the confusion matrix, I treat elements whose IoU is below the threshold as confusion with the background. My current implementation looks like this:
# conf_m is a global (num_classes x num_classes) count matrix;
# rows are ground-truth classes, columns are predicted classes, class 0 is background
def insert_into_conf_m(true_labels, predicted_labels, true_boxes, predicted_boxes):
    matched_gts = []
    for i in range(len(true_labels)):
        j = 0
        # greedily match remaining predictions against ground-truth box i
        while len(predicted_labels) != 0:
            if j >= len(predicted_boxes):
                break
            if bb_intersection_over_union(true_boxes[i], predicted_boxes[j]) >= 0.7:
                conf_m[true_labels[i]][predicted_labels[j]] += 1
                del predicted_boxes[j]
                del predicted_labels[j]
            else:
                j += 1
        matched_gts.append(true_labels[i])
        if len(predicted_labels) == 0:
            break
    # if there are ground-truth boxes that are not matched by any proposal,
    # they are treated as if the model classified them as background
    if len(true_labels) > len(matched_gts):
        true_labels = [i for i in true_labels if i not in matched_gts or matched_gts.remove(i)]
        for i in range(len(true_labels)):
            conf_m[true_labels[i]][0] += 1
    # all detections that have no IoU with any ground-truth box are treated
    # as if the ground-truth label for this region was background (0)
    if len(predicted_labels) != 0:
        for j in range(len(predicted_labels)):
            conf_m[0][predicted_labels[j]] += 1
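For reference, bb_intersection_over_union is the usual corner-format IoU computation; roughly the following, assuming boxes given as [x1, y1, x2, y2]:

def bb_intersection_over_union(box_a, box_b):
    # intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # intersection area is zero if the boxes do not overlap
    inter = max(0, x2 - x1) * max(0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])

    # IoU = intersection / union
    return inter / float(area_a + area_b - inter)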
The row-normalized matrix looks like this:
[0.0, 0.36, 0.34, 0.30]
[0.0, 0.29, 0.30, 0.41]
[0.0, 0.20, 0.47, 0.33]
[0.0, 0.23, 0.19, 0.58]
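(The normalization itself is just each row divided by its row sum; with numpy it is something like this, where conf_m is the accumulated count matrix from above:)

import numpy as np

counts = np.asarray(conf_m, dtype=float)
row_sums = counts.sum(axis=1, keepdims=True)
# guard against empty rows to avoid division by zero
normalized = counts / np.maximum(row_sums, 1e-9)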
Is there a better way to generate the confusion matrix for an object detection system? Or is there another metric that would be more suitable?