
I am using TensorFlow-Slim and have added a few lines to eval_image_classifier.py (located in /models/slim/) to compute TP, TN, FP and FN. However, the accuracy I calculate from these counts, Accuracy = (TP + TN) / (TP + FP + FN + TN), does not match the accuracy reported by slim.metrics.streaming_accuracy(predictions, labels).

I changed the standard code from this:

names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'Accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'Recall_5': slim.metrics.streaming_recall_at_k(
        logits, labels, 5),
})

...to this:

names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'Accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'TruePositives': slim.metrics.streaming_true_positives(predictions, labels),
    'TrueNegatives': slim.metrics.streaming_true_negatives(predictions, labels),
    'FalsePositives': slim.metrics.streaming_false_positives(predictions, labels),
    'FalseNegatives': slim.metrics.streaming_false_negatives(predictions, labels),
    'Recall_5': slim.metrics.streaming_recall_at_k(
        logits, labels, 5),
})

Output:

I tensorflow/core/kernels/logging_ops.cc:79] eval/TruePositives[322]
I tensorflow/core/kernels/logging_ops.cc:79] eval/TrueNegatives[72]
I tensorflow/core/kernels/logging_ops.cc:79] eval/FalsePositives[4]
I tensorflow/core/kernels/logging_ops.cc:79] eval/FalseNegatives[2]
I tensorflow/core/kernels/logging_ops.cc:79] eval/Accuracy[0.9525]
I tensorflow/core/kernels/logging_ops.cc:79] eval/Recall_5[1]
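
To make the mismatch concrete, here is the arithmetic on the logged values as a short Python sketch. The variable names are mine and purely illustrative; only the four counts and the reported accuracy come from the log.

```python
# Counts copied from the evaluation log.
tp, tn, fp, fn = 322, 72, 4, 2

total = tp + tn + fp + fn
accuracy_from_counts = (tp + tn) / total

# Value logged for eval/Accuracy in the same run.
reported_accuracy = 0.9525

# Number of correct predictions implied by each figure.
correct_from_counts = tp + tn
correct_reported = round(reported_accuracy * total)

print(total)                                   # 400
print(accuracy_from_counts)                    # 0.985
print(correct_reported)                        # 381
print(correct_from_counts - correct_reported)  # 13
```

So the four counts imply 394 correct predictions out of 400, while the reported accuracy corresponds to 381 out of 400, a gap of 13 samples.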

I ran the finetune_resnet_v1_50_on_flowers.sh script (located in /models/slim/scripts) without any other changes (repository cloned on 12.04.2017).

I can't find my mistake. I would appreciate any answers, opinions, or concrete suggestions for this problem.

CUDA version: release 8.0, V8.0.53

TensorFlow installed from binary, tested versions: 1.0.1 and 1.1.0rc1

GPU: NVIDIA Tesla P100 (SXM2)

Elternhaus
  • Flowers is not binary classification, is it? If not, could you please [file a bug on Github](https://github.com/tensorflow/tensorflow/issues) requesting that an error be thrown when TP/TN/FP/FN are used in multi-class classification? – Allen Lavoie Apr 17 '17 at 16:34
  • It isn't binary. I also think that Github is the best place for this possible bug report. Therefore, I've opened an issue in the models section (where slim is placed). Unfortunately, it was closed with the suggestion to ask this on StackOverflow (https://github.com/tensorflow/models/issues/1337). – Elternhaus Apr 18 '17 at 14:31
  • To be clear, the fact that TP/TN/FP/FN don't add up to accuracy for multi-class classification isn't a bug (they don't have any real meaning there). The thing that would make a good feature request is that an error be thrown when they're used in multi-class classification. – Allen Lavoie Apr 18 '17 at 15:32
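
A minimal pure-Python sketch of why the counts need not add up to accuracy in the multi-class case. It assumes, as the tf.metrics documentation for true_positives and related metrics states, that predictions and labels are cast to bool, so class 0 becomes False and every other class becomes True. The helper functions and the toy data below are hypothetical and are not taken from Slim or from the Flowers run.

```python
def binary_confusion_counts(predictions, labels):
    """Count TP/TN/FP/FN after casting class ids to bool (0 -> False, else True)."""
    tp = tn = fp = fn = 0
    for pred, label in zip(predictions, labels):
        pred_pos, label_pos = bool(pred), bool(label)
        if pred_pos and label_pos:
            tp += 1
        elif not pred_pos and not label_pos:
            tn += 1
        elif pred_pos and not label_pos:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def multiclass_accuracy(predictions, labels):
    """Fraction of samples whose predicted class id equals the label."""
    matches = sum(p == l for p, l in zip(predictions, labels))
    return matches / len(labels)


# Toy 5-class data (class ids 0..4), made up for illustration.
labels      = [0, 0, 1, 2, 3, 4, 1, 2]
predictions = [0, 1, 2, 2, 4, 4, 0, 1]

tp, tn, fp, fn = binary_confusion_counts(predictions, labels)
accuracy_from_counts = (tp + tn) / (tp + tn + fp + fn)

print((tp, tn, fp, fn))                          # (5, 1, 1, 1)
print(accuracy_from_counts)                      # 0.75
print(multiclass_accuracy(predictions, labels))  # 0.375
```

Under this assumption, a sample labelled class 1 but predicted as class 2 still counts as a true positive, because both values are non-zero. The count-based figure can therefore only be equal to or higher than the real multi-class accuracy, which is the direction of the gap seen in the log.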
