I was looking for a good error metric for multiclass classifiers, and many people say the F1 measure is commonly used. But given that the predictions of a multiclass classifier are one-hot vectors, doesn't that mean there are no true positives whenever a single prediction is wrong? What I mean is:
When the prediction is correct, every element of the vector is a true negative except for the single '1', which is a true positive. So the precision for that sample is just 1.
And when the prediction is incorrect, there are no true positives at all, so the precision is 0.
I can see that F1 is a powerful metric for multilabel classification, since there can be more than one '1' in the vector, but applying F1 to multiclass classification seems a bit strange to me. Isn't it just the same as accuracy? Or does it mean that the F1 score should be computed per class and then averaged?
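To make the question concrete, here is a small sketch of what I think "per-class F1" would mean, using made-up toy labels (the labels and helper function are just for illustration). It computes F1 one class at a time by treating that class as the positive class, then macro-averages; it also checks my suspicion that pooling the counts over all classes (micro-averaging) collapses back to accuracy:

```python
# Toy single-label multiclass example (labels are assumed, not real data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

classes = sorted(set(y_true) | set(y_pred))

def class_counts(y_true, y_pred, cls):
    # One-vs-rest counts: treat `cls` as positive, everything else as negative.
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    return tp, fp, fn

def f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Macro-F1: average of per-class F1 scores.
macro_f1 = sum(f1(*class_counts(y_true, y_pred, c)) for c in classes) / len(classes)

# Micro-F1: pool the counts over all classes, then compute one F1.
tp_all = sum(class_counts(y_true, y_pred, c)[0] for c in classes)
fp_all = sum(class_counts(y_true, y_pred, c)[1] for c in classes)
fn_all = sum(class_counts(y_true, y_pred, c)[2] for c in classes)
micro_f1 = f1(tp_all, fp_all, fn_all)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(macro_f1, micro_f1, accuracy)
```

If I've understood correctly, micro-F1 always equals accuracy in the single-label multiclass case (every false positive for one class is a false negative for another), while macro-F1 can differ because it weights each class equally regardless of how many samples it has.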