1

With unbalanced data, how can I use ImageDataGenerator() to generate enough augmented samples for the under-represented classes so that all categories are balanced?

DragonKnight
  • I don't think you can do that with `ImageDataGenerator`. There is simply no built-in option for that. However, you can use the `class_weight` argument of the `fit` method to make up for the smaller contribution of low-count classes. – today Apr 21 '20 at 18:38
  • @today would you mind providing a simple example? – DragonKnight Apr 21 '20 at 20:00
  • This may be your solution. https://stackoverflow.com/questions/42586475/is-it-possible-to-automatically-infer-the-class-weight-from-flow-from-directory – CRich Feb 12 '21 at 05:33

2 Answers

1

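You can use the following code:

# Assumes TensorFlow's bundled Keras; with standalone Keras, import from keras.preprocessing.image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random augmentations applied on the fly to each batch
datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)

# featurewise_* options need dataset statistics: call datagen.fit(x_train) before use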

This will not change your dataset on disk; the transformations are applied to the images on the fly as they are fed to the model.
You may refer to the documentation: Image Preprocessing.
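
Since the generator augments every class at the same rate, balancing by augmentation means deciding how many extra images each class needs. A minimal sketch of that bookkeeping in plain Python, assuming you have a list of integer labels; the helper name `augmentation_deficit` and the example labels are hypothetical:

```python
from collections import Counter


def augmentation_deficit(labels):
    """Return how many extra (augmented) samples each class needs
    to match the size of the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical labels: class 0 has 5 images, class 1 has 2, class 2 has 3.
labels = [0] * 5 + [1] * 2 + [2] * 3
deficit = augmentation_deficit(labels)
print(deficit)
```

With those numbers in hand, one option is to run the images of each under-represented class through `datagen.flow(...)` until that many augmented images have been produced.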
Hope this helps.

Harshit Ruwali
0

You need to create a dictionary of per-class weights and then pass it to model.fit_generator:

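import numpy as np
from sklearn.utils import class_weight

# Keyword arguments are required by recent scikit-learn versions
class_weights = class_weight.compute_class_weight(
    class_weight='balanced',
    classes=np.unique(train_generator.classes),
    y=train_generator.classes)

# Map class index -> weight
train_class_weights = dict(enumerate(class_weights))
# In recent TensorFlow versions, model.fit accepts generators and fit_generator is deprecated
model.fit_generator(..., class_weight=train_class_weights)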
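
To see what the 'balanced' option computes, here is a rough sketch of the same formula, n_samples / (n_classes * count_per_class), in plain Python. The function name `balanced_class_weights` and the example labels are made up for illustration:

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rarer classes get proportionally larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in sorted(counts.items())}


# Hypothetical labels: 6 images of class 0, 3 of class 1, 1 of class 2.
labels = [0] * 6 + [1] * 3 + [2] * 1
weights = balanced_class_weights(labels)
print(weights)
```

The rarest class gets the largest weight, so its few samples contribute more to the loss.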
Taisa