Given imbalanced data, how can I use ImageDataGenerator()
to generate enough augmented data for the under-represented classes to balance all categories?
1

DragonKnight
- I don't think you can do that with `ImageDataGenerator`; there is simply no built-in option for it. However, you can use the `class_weight` argument of the `fit` method to make up for the smaller contribution of low-count classes. – today Apr 21 '20 at 18:38
- @today Would you mind providing a simple example? – DragonKnight Apr 21 '20 at 20:00
- This may be your solution: https://stackoverflow.com/questions/42586475/is-it-possible-to-automatically-infer-the-class-weight-from-flow-from-directory – CRich Feb 12 '21 at 05:33
2 Answers
1
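You can use the following code:

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # The featurewise_* options need dataset statistics, so call
    # datagen.fit(...) on your training images before using the generator.
    datagen = ImageDataGenerator(
        featurewise_center=True,
        featurewise_std_normalization=True,
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        horizontal_flip=True)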
This does not modify your dataset on disk. The transformations are applied on the fly as images are fed into the model.
You may refer to the documentation: Image Preprocessing.
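Because the generator augments every class at the same rate, it does not rebalance anything by itself. One common workaround, which is not a built-in `ImageDataGenerator` feature, is to oversample the minority classes before augmentation so that each repeated image gets a different random transform. Below is a minimal sketch of that idea; the function name `oversample_indices` and the toy labels are hypothetical.

```python
import random
from collections import Counter


def oversample_indices(labels, seed=0):
    """Return sample indices in which every class appears as often as the largest class.

    Minority-class indices are repeated (drawn with replacement); random
    augmentation applied later makes each repeat look different.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    indices = []
    for cls, count in counts.items():
        cls_indices = [i for i, label in enumerate(labels) if label == cls]
        # Keep every original sample, then draw the missing ones with replacement.
        extra = [rng.choice(cls_indices) for _ in range(target - count)]
        indices.extend(cls_indices + extra)

    rng.shuffle(indices)
    return indices


# Hypothetical labels: class 0 has 6 samples, class 1 has only 2.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
balanced = oversample_indices(labels)
balanced_counts = Counter(labels[i] for i in balanced)
print(balanced_counts)  # both classes now appear 6 times
```

With in-memory NumPy arrays, something like `datagen.flow(x_train[balanced], y_train[balanced], batch_size=32)` would then draw augmented batches from the balanced index list; `x_train` and `y_train` are placeholder names here.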
Hope this helps.

Harshit Ruwali
0
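Create a dictionary that maps each class index to its weight, then pass it to `model.fit_generator` through the `class_weight` argument:

    import numpy as np
    from sklearn.utils import class_weight

    # 'balanced' weights each class inversely to its frequency.
    # Recent scikit-learn versions require keyword arguments here.
    class_weights = class_weight.compute_class_weight(
        class_weight='balanced',
        classes=np.unique(train_generator.classes),
        y=train_generator.classes)
    train_class_weights = dict(enumerate(class_weights))
    # In TensorFlow 2.1+, fit_generator is deprecated; model.fit accepts generators too.
    model.fit_generator(..., class_weight=train_class_weights)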
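To see what the `'balanced'` setting produces, it may help to compute the weights by hand. scikit-learn documents the heuristic as `n_samples / (n_classes * np.bincount(y))`; the sketch below reimplements that formula in plain Python on made-up labels, so the helper name `balanced_class_weights` and the 90/10 split are assumptions for illustration only.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Compute n_samples / (n_classes * class_count) for each class."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in sorted(counts.items())}


# Hypothetical class indices, in the same form as train_generator.classes:
# 90 images of class 0 and 10 images of class 1.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

Here class 1 gets a weight of 5.0 and class 0 a weight of about 0.56, so each error on the rare class counts roughly nine times as much in the loss. This reweights the loss rather than generating additional images.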

Taisa