I want to apply data augmentation to my images to help with over-fitting. In my training dataset, there are 25 classes and 7982 images.

I loaded the training images using the code below:

# Loading data
train_images = image_generator.flow_from_dataframe(
    dataframe=image_df,
    x_col='Filepath',
    y_col='Label',
    target_size=(224, 224),
    color_mode='rgb',
    class_mode='categorical',
    batch_size=32,
    shuffle=True,
    seed=42,
    subset='training'
)

and it prints "Found 7982 validated image filenames belonging to 25 classes."

Now I want to add image augmentation to effectively increase the number of images in the training dataset. To implement the data augmentation I have used the code below:

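from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random transformations applied to each image as it is loaded
train_datagen = ImageDataGenerator(
    rescale=1./255,            # scale pixel values to [0, 1]
    rotation_range=40,         # rotate by up to 40 degrees
    width_shift_range=0.2,     # shift horizontally by up to 20% of the width
    height_shift_range=0.2,    # shift vertically by up to 20% of the height
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')       # fill newly exposed pixels with the nearest value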
# Applying data augmentation to the training dataset.
train_images_aug = train_datagen.flow_from_dataframe(
    dataframe=image_df,
    x_col='Filepath',
    y_col='Label',
    target_size=(224, 224),
    color_mode='rgb',
    class_mode='categorical',
    batch_size=32,
    shuffle=True,
    seed=42,
    subset='training'
) 

which also prints "Found 7982 validated image filenames belonging to 25 classes."

I am not sure whether train_images_aug should report more images than the standard training dataset. Shouldn't the count be higher after augmentation?

  • It actually does not add new images to the original dataset, instead it performs transformations on the batches of data while training. – Frightera Feb 26 '23 at 11:32
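To make the comment concrete, here is a minimal sketch of on-the-fly augmentation in plain Python. It is only a conceptual stand-in, not how Keras implements ImageDataGenerator internally: the helper `augmented_epoch` and the tiny nested-list "images" are hypothetical, and a random horizontal flip stands in for the full set of transformations. It shows that the number of samples per epoch stays fixed while the pixels the model sees can differ each time.

```python
import math
import random


def augmented_epoch(images, batch_size, rng):
    """Yield one epoch of batches, flipping each image with probability 0.5.

    Hypothetical helper for illustration only. The source list is never
    modified or extended; a transformed copy is produced when a batch is built.
    """
    order = list(range(len(images)))
    rng.shuffle(order)  # same role as shuffle=True in the question
    for start in range(0, len(order), batch_size):
        batch = []
        for i in order[start:start + batch_size]:
            img = images[i]
            if rng.random() < 0.5:
                img = [row[::-1] for row in img]  # mirror left to right
            batch.append(img)
        yield batch


rng = random.Random(42)
# 10 tiny stand-in "images", each a 2x3 grid of pixel values
images = [[[rng.random() for _ in range(3)] for _ in range(2)] for _ in range(10)]

# Every epoch still yields exactly as many samples as there are source images.
epoch_sizes = [sum(len(b) for b in augmented_epoch(images, 4, rng)) for _ in range(3)]
print(epoch_sizes)  # [10, 10, 10]

# With the numbers from the question: batches per epoch for 7982 images
steps_per_epoch = math.ceil(7982 / 32)
print(steps_per_epoch)  # 250
```

So the "Found 7982 validated image filenames" message is expected for both generators: it counts the files on disk, not the variations drawn from them. Over many epochs the model sees a differently transformed version of each of those 7982 images, which is where the regularizing effect comes from.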
