I am trying to use F1 scores for model selection in multiclass classification.
I am calculating them class-wise and averaging over them: (F1(class1) + F1(class2) + F1(class3))/3 = F1(total), where each class's F1 is computed from that class's precision and recall as 2*P*R/(P+R).
However, in some cases I get NaN values for the F1 score. Here is an example: Let true_label = [1 1 1 2 2 2 3 3 3] and pred_label = [2 2 2 2 2 2 3 3 3].
Then the confusion matrix (rows = true class, columns = predicted class) looks like: C = [0 3 0; 0 3 0; 0 0 3]
Which means that when I calculate the precision for the first class (needed for its F1 score), I get TP/(TP+FP) = 0/(0+0+0) = 0/0, which is undefined, i.e. NaN.
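For reference, here is a minimal MATLAB sketch of how I compute this (it assumes confusionmat from the Statistics and Machine Learning Toolbox; the variable names are just for illustration):

    true_label = [1 1 1 2 2 2 3 3 3];
    pred_label = [2 2 2 2 2 2 3 3 3];

    % Rows = true class, columns = predicted class
    C = confusionmat(true_label, pred_label);

    % Per-class precision and recall from the confusion matrix
    precision = diag(C)' ./ sum(C, 1);   % TP ./ (TP + FP), column sums
    recall    = diag(C)' ./ sum(C, 2)';  % TP ./ (TP + FN), row sums

    % Per-class F1 and the macro average
    F1 = 2 * (precision .* recall) ./ (precision + recall);
    macroF1 = mean(F1);

    % Here precision(1) = 0/0 = NaN, so F1(1) = NaN and macroF1 = NaN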
Firstly, am I making a mistake in calculating the F1 scores or precisions here? Secondly, how should I treat these cases in model selection? Should I ignore them, or should I set the F1 score for this class to 0 (which lowers the total F1 score for this model)?
Any help would be greatly appreciated!