Two facts:
As stated in other answers, TensorFlow's built-in precision and recall metrics don't support multi-class problems (the docs say the inputs will be cast to bool).
There are ways of getting one-versus-all scores, either by using precision_at_k and specifying the class_id, or by simply casting your labels and predictions to tf.bool in the right way.
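To make the "cast to bool in the right way" idea concrete, here is a minimal plain-Python sketch of one-versus-all scoring. It does not use TensorFlow; the helper name one_vs_rest_scores is hypothetical and only illustrates the binarization step (a label becomes True when it equals the class of interest, False otherwise).

```python
def one_vs_rest_scores(y_true, y_pred, class_id):
    """Precision and recall for one class, treating all others as negative.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    # Binarize: True where the label equals the class of interest.
    true_bin = [label == class_id for label in y_true]
    pred_bin = [label == class_id for label in y_pred]

    tp = sum(1 for t, p in zip(true_bin, pred_bin) if t and p)
    fp = sum(1 for t, p in zip(true_bin, pred_bin) if not t and p)
    fn = sum(1 for t, p in zip(true_bin, pred_bin) if t and not p)

    # Guard against empty denominators.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


y_true = [0, 1, 0, 0, 0, 2, 3, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 2, 0, 3, 3, 1]

precision_1, recall_1 = one_vs_rest_scores(y_true, y_pred, class_id=1)
print(precision_1, recall_1)
```

In TensorFlow the same binarization would be done on tensors (for example with an equality comparison against the class id) before passing them to the built-in binary metrics.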
Because this is unsatisfying and incomplete, I wrote tf_metrics, a simple package for multi-class metrics that you can find on GitHub. It supports multiple averaging methods, like scikit-learn.
Example
import tensorflow as tf
import tf_metrics
y_true = [0, 1, 0, 0, 0, 2, 3, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 2, 0, 3, 3, 1]
pos_indices = [1] # Metrics for class 1 -- or
pos_indices = [1, 2, 3] # Average metrics, 0 is the 'negative' class
num_classes = 4
average = 'micro'
# Tuple of (value, update_op)
precision = tf_metrics.precision(
    y_true, y_pred, num_classes, pos_indices, average=average)
recall = tf_metrics.recall(
    y_true, y_pred, num_classes, pos_indices, average=average)
f2 = tf_metrics.fbeta(
    y_true, y_pred, num_classes, pos_indices, average=average, beta=2)
f1 = tf_metrics.f1(
    y_true, y_pred, num_classes, pos_indices, average=average)
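
For readers unsure what 'micro' averaging over pos_indices means, the following plain-Python sketch computes it by hand under the standard definition (pool true positives, false positives and false negatives over the positive classes, then compute the scores once). The function micro_scores is a hypothetical illustration, not the tf_metrics implementation, and it is not a claim about the package's exact output.

```python
def micro_scores(y_true, y_pred, pos_indices, beta=1.0):
    """Micro-averaged precision, recall and F-beta over the positive classes.

    Hypothetical helper following the standard definition of micro averaging.
    """
    pos = set(pos_indices)
    pairs = list(zip(y_true, y_pred))

    # Counts are pooled across all positive classes.
    tp = sum(1 for t, p in pairs if t == p and t in pos)
    fp = sum(1 for t, p in pairs if p in pos and t != p)
    fn = sum(1 for t, p in pairs if t in pos and t != p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    denom = beta ** 2 * precision + recall
    fbeta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, fbeta


y_true = [0, 1, 0, 0, 0, 2, 3, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 2, 0, 3, 3, 1]

micro_p, micro_r, micro_f1 = micro_scores(y_true, y_pred, [1, 2, 3])
_, _, micro_f2 = micro_scores(y_true, y_pred, [1, 2, 3], beta=2.0)
print(micro_p, micro_r, micro_f1, micro_f2)
```

Class 0 is excluded from the pooled counts, which matches the idea of treating it as the 'negative' class in the example above.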