[![enter image description here][1]][1]

Why might the reported F1 score not be the harmonic mean of precision and recall when macro averaging, which weights every class equally, is used for a multi-class problem? My dataset is imbalanced, and the predictions are skewed.
Not a programming question, hence arguably off-topic here; better suited for [Cross Validated](https://stats.stackexchange.com/help/on-topic). – desertnaut Feb 02 '19 at 19:07
1 Answer
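Macro F1 calculates the metric for each label and takes the unweighted mean, so it does not take class imbalance into account. Weighted F1 also calculates the metric for each label, but averages the per-label scores weighted by the number of true instances of each label (the support). It therefore accounts for class imbalance, and it can produce a score that is not between precision and recall. In both cases the reported F1 is an average of per-class F1 scores, not the harmonic mean of the averaged precision and the averaged recall, which is why it need not lie between them.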
For an example of weighted F1, refer to this answer Sandeep.
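To make this concrete, here is a minimal pure-Python sketch. The labels `"a"` and `"b"`, the toy data, and the helper names `per_class_scores` and `average` are all made up for illustration; they are not taken from the question. It computes the per-class scores by hand and then averages them, which is my understanding of what macro and weighted averaging do, and compares macro F1 with the harmonic mean of macro precision and macro recall.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for two label lists."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, tp + fn)
    return scores


def average(scores, index, weighted=False):
    """Average one metric (0=precision, 1=recall, 2=F1) over the classes."""
    if weighted:
        # Weight each class by its support (number of true instances).
        total = sum(s[3] for s in scores.values())
        return sum(s[index] * s[3] for s in scores.values()) / total
    # Macro: every class counts equally, regardless of its size.
    return sum(s[index] for s in scores.values()) / len(scores)


# Hypothetical imbalanced data: 10 true "a", 2 true "b",
# with predictions skewed towards "b".
y_true = ["a"] * 10 + ["b"] * 2
y_pred = ["a"] * 2 + ["b"] * 8 + ["b"] * 2

scores = per_class_scores(y_true, y_pred)
macro_p = average(scores, 0)
macro_r = average(scores, 1)
macro_f1 = average(scores, 2)
weighted_f1 = average(scores, 2, weighted=True)
# Harmonic mean of the already-averaged precision and recall.
harmonic = 2 * macro_p * macro_r / (macro_p + macro_r)

print("macro precision:", round(macro_p, 3))
print("macro recall:   ", round(macro_r, 3))
print("macro F1:       ", round(macro_f1, 3))
print("harmonic mean of macro P and macro R:", round(harmonic, 3))
print("weighted F1:    ", round(weighted_f1, 3))
```

On this toy data, class `"a"` has precision 1.0 and recall 0.2, while class `"b"` has precision 0.2 and recall 1.0. Macro precision and macro recall both come out at 0.6, but each per-class F1 is about 0.333, so macro F1 is well below both of them.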

Raunaq Jain