
Precision is obtained below using the Keras library:

model.compile(optimizer='sgd',
          loss='mse',
          metrics=[tf.keras.metrics.Precision()])

Which type of precision calculated by sklearn is equal to the precision calculated by Keras?

precision_score(y_true, y_pred, average=???)
  1. macro
  2. micro
  3. weighted
  4. none

What happens when you set zero_division to 1, as below?

precision_score(y_true, y_pred, average=None, zero_division=1)
Akshay Sehgal

1 Answer


TL;DR: The default is binary for binary classification and micro for multi-class classification. Other average types such as None and macro can also be achieved with minor modifications, as explained below.


This should give you some clarity on the differences between tf.keras.metrics.Precision() and sklearn.metrics.precision_score(). Let's compare the different scenarios.

Scenario 1: Binary classification

For binary classification, both y_true and y_pred consist of 0s and 1s. The implementation of both metrics is quite straightforward.

Sklearn documentation: Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

#Binary classification

from sklearn.metrics import precision_score
import tensorflow as tf

y_true = [0,1,1,1]
y_pred = [1,0,1,1]

print('sklearn precision: ',precision_score(y_true, y_pred, average='binary'))
#Only report results for the class specified by pos_label. 
#This is applicable only if targets (y_{true,pred}) are binary.

m = tf.keras.metrics.Precision()
m.update_state(y_true, y_pred)
print('tf.keras precision:',m.result().numpy())
sklearn precision:  0.6666666666666666
tf.keras precision: 0.6666667

Scenario 2: Multi-class classification (global precision)

Here you are working with multi-class labels, but you are not concerned with the precision of each individual class. You simply want a global count of TP and FP to calculate a total precision score. In sklearn this is selected by setting the average parameter to micro, while in tf.keras this is the default behaviour of Precision().

Sklearn documentation: Calculate metrics globally by counting the total true positives, false negatives and false positives.

#Multi-class classification (global precision)

#3 classes, 6 samples
y_true = [[1,0,0],[0,1,0],[0,0,1],[1,0,0],[0,1,0],[0,0,1]]
y_pred = [[1,0,0],[0,0,1],[0,1,0],[1,0,0],[1,0,0],[0,1,0]]

print('sklearn precision: ',precision_score(y_true, y_pred, average='micro'))
#Calculate metrics globally by counting the total true positives, false negatives and false positives.

m = tf.keras.metrics.Precision()
m.update_state(y_true, y_pred)
print('tf.keras precision:',m.result().numpy())
sklearn precision:  0.3333333333333333
tf.keras precision: 0.33333334
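To see where 0.3333 comes from, you can count the global TP and FP by hand; a minimal sketch with NumPy (not part of the original answer, just illustrating the micro calculation):

```python
import numpy as np

y_true = np.array([[1,0,0],[0,1,0],[0,0,1],[1,0,0],[0,1,0],[0,0,1]])
y_pred = np.array([[1,0,0],[0,0,1],[0,1,0],[1,0,0],[1,0,0],[0,1,0]])

tp = np.sum((y_true == 1) & (y_pred == 1))  # correct positive predictions, pooled over all classes
fp = np.sum((y_true == 0) & (y_pred == 1))  # wrong positive predictions, pooled over all classes

print(tp, fp, tp / (tp + fp))  # 2 4 0.3333333333333333
```

Only 2 of the 6 positive predictions match the true labels, hence 2/6.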

Scenario 3: Multi-class classification (binary precision for each label)

You are interested in this scenario if you want to know the precision for each individual class. In sklearn this is done by setting the average parameter to None, while in tf.keras you will have to instantiate the object for each individual class separately using class_id.

Sklearn documentation: If None, the scores for each class are returned.

#Multi-class classification (binary precision for each label)

#3 classes, 6 samples
y_true = [[1,0,0],[0,1,0],[0,0,1],[1,0,0],[0,1,0],[0,0,1]]
y_pred = [[1,0,0],[0,0,1],[0,1,0],[1,0,0],[1,0,0],[0,1,0]]

print('sklearn precision: ',precision_score(y_true, y_pred, average=None))
#If None, the scores for each class are returned.

#For class 0
m0 = tf.keras.metrics.Precision(class_id=0)
m0.update_state(y_true, y_pred)

#For class 1
m1 = tf.keras.metrics.Precision(class_id=1)
m1.update_state(y_true, y_pred)

#For class 2
m2 = tf.keras.metrics.Precision(class_id=2)
m2.update_state(y_true, y_pred)

mm = [m0.result().numpy(), m1.result().numpy(), m2.result().numpy()]

print('tf.keras precision:',mm)
sklearn precision:  [0.66666667 0.         0.        ]
tf.keras precision: [0.6666667, 0.0, 0.0]
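If the number of classes is large, instantiating the metrics one by one gets tedious; the same per-class scores can be collected in a loop (a sketch equivalent to the m0/m1/m2 code above):

```python
import tensorflow as tf

y_true = [[1,0,0],[0,1,0],[0,0,1],[1,0,0],[0,1,0],[0,0,1]]
y_pred = [[1,0,0],[0,0,1],[0,1,0],[1,0,0],[1,0,0],[0,1,0]]

# One Precision metric per class, keyed by class_id
per_class = []
for c in range(3):
    m = tf.keras.metrics.Precision(class_id=c)
    m.update_state(y_true, y_pred)
    per_class.append(float(m.result().numpy()))

print(per_class)
```

This prints the same per-class scores as above: ~0.6667 for class 0 and 0.0 for classes 1 and 2.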

Scenario 4: Multi-class classification (Average of individual binary scores)

Once you have calculated the individual precision for each class, you may want to take the average score (or weighted average). In sklearn, a simple average of the individual scores is taken by setting the parameter average to macro. In tf.keras you can get the same result by taking an average of the individual precisions as calculated in the scenario above.

Sklearn documentation: Calculate metrics for each label, and find their unweighted mean.

#Multi-class classification (Average of individual binary scores)

import numpy as np

#3 classes, 6 samples
y_true = [[1,0,0],[0,1,0],[0,0,1],[1,0,0],[0,1,0],[0,0,1]]
y_pred = [[1,0,0],[0,0,1],[0,1,0],[1,0,0],[1,0,0],[0,1,0]]

print('sklearn precision (Macro): ',precision_score(y_true, y_pred, average='macro'))
print('sklearn precision (Avg of None):' ,np.average(precision_score(y_true, y_pred, average=None)))

print(' ')

print('tf.keras precision:',np.average(mm)) #mm is list of individual precision scores
sklearn precision (Macro):  0.2222222222222222
sklearn precision (Avg of None):  0.2222222222222222
 
tf.keras precision: 0.22222222
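For the weighted average mentioned above, sklearn offers average='weighted', which weights each class's score by its support (the number of true instances of that class). A sketch showing it agrees with a manual weighted mean of the per-class scores (here all supports are equal, so it coincides with macro):

```python
import numpy as np
from sklearn.metrics import precision_score

y_true = [[1,0,0],[0,1,0],[0,0,1],[1,0,0],[0,1,0],[0,0,1]]
y_pred = [[1,0,0],[0,0,1],[0,1,0],[1,0,0],[1,0,0],[0,1,0]]

weighted = precision_score(y_true, y_pred, average='weighted')

# Manual equivalent: per-class scores weighted by the number of true instances per class
per_class = precision_score(y_true, y_pred, average=None)
support = np.sum(y_true, axis=0)  # [2, 2, 2]
manual = np.average(per_class, weights=support)

print(weighted, manual)  # both 0.2222222222222222
```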

NOTE: Remember that with sklearn, models predict labels directly and precision_score is a standalone function, so it can operate directly on lists of predicted and actual labels. However, tf.keras.metrics.Precision() is a metric applied over a binary or multi-class dense output; it will NOT work with labels directly. You have to give it an n-length array for each sample, where n is the number of classes/output dense nodes.
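So if your data is sklearn-style integer labels, you need to one-hot encode it first. A minimal sketch, assuming the labels are dense integers 0..n-1 (this reproduces the scenario-2 data and result):

```python
import tensorflow as tf

y_true_labels = [0, 1, 2, 0, 1, 2]  # integer class labels, sklearn-style
y_pred_labels = [0, 2, 1, 0, 0, 1]

# One-hot encode into n-length arrays so the metric can consume them
y_true = tf.one_hot(y_true_labels, depth=3)
y_pred = tf.one_hot(y_pred_labels, depth=3)

m = tf.keras.metrics.Precision()
m.update_state(y_true, y_pred)
print(m.result().numpy())  # 0.33333334, same as scenario 2
```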

Hope this clarifies how the two differ in the various scenarios. Please find more details in the sklearn documentation and the tf.keras documentation.


Regarding your second question:

As per the sklearn documentation,

zero_division: "warn", 0 or 1, default="warn"
Sets the value to return when there is a zero division. If set to "warn", this acts as 0, but warnings are also raised.

This is an exception-handling flag. If a divide-by-zero (0/0) is encountered while computing the score, the result is treated as 0 and a warning is raised by default; if you explicitly set zero_division=1, the result is reported as 1 instead.
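A small illustration: below, class 1 is never predicted, so its precision is 0 TP / (0 TP + 0 FP) = 0/0, and zero_division decides what is reported for it:

```python
from sklearn.metrics import precision_score

y_true = [0, 1, 1]
y_pred = [0, 0, 0]  # class 1 is never predicted -> its precision is 0/0

p0 = precision_score(y_true, y_pred, average=None, zero_division=0)
p1 = precision_score(y_true, y_pred, average=None, zero_division=1)

print(p0)  # class 1 gets 0.0
print(p1)  # class 1 gets 1.0
```

Class 0's score (1/3) is unaffected; only the undefined class changes.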
