I am dealing with an imbalanced dataset and tried to handle it through the choice of validation metric.
In the scikit-learn documentation I found the following for `average='weighted'`:
Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
Does calculating the average weighted by support mean that classes with more samples are weighted more heavily than classes with fewer samples, or, as seems more logical to me, that smaller classes are weighted more than bigger ones?
I couldn't find anything about this in the documentation and wanted to make sure I'm choosing the right metric.
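For reference, here is a minimal sketch I put together to compare the averaging modes on a toy imbalanced dataset (the labels and predictions below are made up purely for illustration):

```python
from sklearn.metrics import f1_score

# Toy imbalanced data: class 0 is the majority (8 samples),
# class 1 the minority (2 samples).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

print(f1_score(y_true, y_pred, average=None))        # per-class F1 scores
print(f1_score(y_true, y_pred, average='macro'))     # unweighted mean of per-class F1
print(f1_score(y_true, y_pred, average='weighted'))  # mean weighted by support
```

If 'weighted' favors the majority class, I would expect the weighted score to land closer to class 0's F1 than the macro score does.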
Thanks!