The Keras documentation states:
Using sample weighting and class weighting
With the default settings the weight of a sample is decided by its frequency in the dataset. There are two methods to weight the data, independent of sample frequency:
- Class weights
- Sample weights
I have a significantly imbalanced dataset. I was looking at how to adjust for this, and came across a few answers here dealing with that, such as here and here. My plan was thus to create a dictionary of the relative class frequencies and pass it to model.fit()'s class_weight parameter.
However, now that I'm reading the documentation, it seems as though class imbalance is already dealt with by default. So I don't necessarily have to account for the different class counts after all?
For the record, here are the class counts: 0: 25,811, 1: 2,444, 2: 5,293, 3: 874, 4: 709.
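And here is the dictionary I was going to pass to model.fit():
counts = {0: 25811, 1: 2444, 2: 5293, 3: 874, 4: 709}  # samples per class
# Weight each class by how much rarer it is than the majority class (0).
class_weight = {label: counts[0] / n for label, n in counts.items()}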
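To make the question concrete, here is a minimal sketch of how I currently read the documentation. It assumes that "default settings" means every sample carries a weight of 1, so a class's share of the total loss weight equals its share of the samples, and that a class weight simply multiplies the weight of every sample in that class. The variable names are my own and nothing here calls Keras itself.

```python
# Class counts from my dataset.
counts = {0: 25811, 1: 2444, 2: 5293, 3: 874, 4: 709}

# Proposed weights: majority-class count divided by each class's count.
class_weight = {label: counts[0] / n for label, n in counts.items()}

# Assumed default behaviour (every sample weighted 1): a class's share of
# the total weight is simply its share of the samples.
total = sum(counts.values())
default_share = {label: n / total for label, n in counts.items()}

# Assumed behaviour with class_weight: each sample's weight is multiplied by
# its class weight, so a class's total weight becomes count * weight.
weighted_total = {label: counts[label] * class_weight[label] for label in counts}
weighted_sum = sum(weighted_total.values())
weighted_share = {label: w / weighted_sum for label, w in weighted_total.items()}

for label in counts:
    print(label,
          round(class_weight[label], 2),
          round(default_share[label], 3),
          round(weighted_share[label], 3))
```

If that reading is right, the default leaves class 0 with roughly 73% of the total weight, whereas the dictionary above gives every class an equal share. I would then pass it as model.fit(x, y, class_weight=class_weight), where model, x and y stand in for my own model and data.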