
I have to calculate the false positive rate for multiclass classification using only numpy methods. I have two numpy arrays: one for the predictions (shape (m, k), where m is the number of samples and k is the number of categories) and another for the true labels (shape (m,)).

What I already did: determined the index of the predicted (positive) category for each row (the prediction_labels array) and collected the unique categories (true_labels).

What I want to do: iterate through the prediction_labels and y_true arrays at the same time and, for each unique value in true_labels, check whether the prediction and the true label agree at each position. From that I want to get the false positive count per category as an array (false_positive_counts).

For example:

import numpy as np

def false_positive_rate(y_pred, y_true):
    prediction_labels = np.argmax(y_pred, axis=1)
    true_labels = np.unique(y_true)
    false_positive_counts = ... # ?
    ...
    return fpr

y_pred = np.array([[1., 0., 0., 0.],
                   [1., 0., 0., 0.], 
                   [0., 0., 1., 0.],
                   [0., 0., 1., 0.],
                   [0., 1., 0., 0.],
                   [0., 0., 0., 1.],
                  ])  # [0,0,2,2,1,3]
y_true = np.array([0, 2, 1, 1, 1, 3])
print(false_positive_rate(y_pred, y_true))   # 3/20
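
The 3/20 in the last comment is consistent with a macro-averaged, one-vs-rest reading of the false positive rate: for each category, FP / (FP + TN), then the unweighted mean over categories. A minimal sketch of that arithmetic on the example labels, assuming the categories are numbered 0..k-1 (the names `pred_is_c`, `true_is_c` and `negative_counts` are illustrative, not part of the original code):

```python
import numpy as np

# Predicted and true labels from the example above.
pred = np.array([0, 0, 2, 2, 1, 3])
true = np.array([0, 2, 1, 1, 1, 3])
classes = np.arange(4)  # assumption: categories are 0..k-1

# One-vs-rest masks of shape (m, k): column c treats category c as "positive".
pred_is_c = pred[:, None] == classes
true_is_c = true[:, None] == classes

# False positive for c: predicted c, but the true label is something else.
false_positive_counts = (pred_is_c & ~true_is_c).sum(axis=0)  # [1, 0, 2, 0]
# Negatives for c: samples whose true label is not c.
negative_counts = (~true_is_c).sum(axis=0)                    # [5, 3, 5, 5]

per_class_fpr = false_positive_counts / negative_counts       # [0.2, 0, 0.4, 0]
macro_fpr = per_class_fpr.mean()                              # 0.15 == 3/20
```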
Zoltán Orosz

1 Answer


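Your predictions can be right or wrong, but the true labels contain no negative class: every sample belongs to some category. If every correct prediction counts as a true positive and every wrong one as a false positive, there are never any true negatives or false negatives.

Therefore TN is always 0, and your FPR = FP / (FP + TN) is always 1 whenever at least one prediction is wrong.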

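import numpy as np

def false_positive_rate(y_pred_raw, y_true):
    # Convert the (m, k) prediction matrix to predicted category indices.
    y_pred = np.argmax(y_pred_raw, axis=1)
    TP, FP, FN, TN = 0, 0, 0, 0
    for pp, tt in zip(y_pred, y_true):
        if pp == tt:
            TP += 1
        else:
            FP += 1
        # there is no case for FN, TN
    print(f"TP={TP}, FP={FP}, FN={FN}, TN={TN}")
    # Guard against division by zero when every prediction is correct.
    FPR = FP / (TN + FP) if (TN + FP) > 0 else 0.0
    return FPR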
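
If each category is instead treated one-vs-rest (an assumption about what the question intends), true negatives do exist: for category c they are the samples that neither belong to c nor were predicted as c. Below is a rough sketch of that reading using only numpy. The name `macro_false_positive_rate` is hypothetical, labels are assumed to be integers in 0..k-1, and returning 0 for a category with no negative samples is just one possible convention.

```python
import numpy as np

def macro_false_positive_rate(y_pred_raw, y_true):
    """Macro-averaged one-vs-rest FPR (hypothetical helper, a sketch only)."""
    y_pred_raw = np.asarray(y_pred_raw)
    y_true = np.asarray(y_true)
    y_pred = np.argmax(y_pred_raw, axis=1)
    k = y_pred_raw.shape[1]  # assumption: labels are integers in 0..k-1

    # Confusion matrix: rows are true labels, columns are predicted labels.
    cm = np.zeros((k, k), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)

    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp      # predicted c, but the true label differs
    fn = cm.sum(axis=1) - tp      # true label c, but predicted something else
    tn = cm.sum() - tp - fp - fn  # neither true nor predicted c

    negatives = fp + tn
    # Categories without any negative sample get FPR 0 (assumed convention).
    fpr = np.divide(fp, negatives,
                    out=np.zeros(k, dtype=float), where=negatives > 0)
    return fpr.mean()

y_pred = np.array([[1., 0., 0., 0.],
                   [1., 0., 0., 0.],
                   [0., 0., 1., 0.],
                   [0., 0., 1., 0.],
                   [0., 1., 0., 0.],
                   [0., 0., 0., 1.]])
y_true = np.array([0, 2, 1, 1, 1, 3])
result = macro_false_positive_rate(y_pred, y_true)
```

On the example from the question, the per-category rates are 1/5, 0, 2/5 and 0, so `result` is 0.15, which matches the 3/20 in the question's comment.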
nambee