I want to apply data augmentation to my images to reduce overfitting. My training dataset has 25 classes and 7982 images.
I loaded the training images using the code below:
# Loading data
train_images = image_generator.flow_from_dataframe(
    dataframe=image_df,
    x_col='Filepath',
    y_col='Label',
    target_size=(224, 224),
    color_mode='rgb',
    class_mode='categorical',
    batch_size=32,
    shuffle=True,
    seed=42,
    subset='training'
)
This prints: Found 7982 validated image filenames belonging to 25 classes.
Now I want to add image augmentation to effectively increase the number of images in the training dataset. To do this, I used the code below:
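# Import path assumed (TensorFlow-bundled Keras); adjust if using standalone Keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Generator that rescales pixels and applies random transforms
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)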
# Applying data augmentation to the training dataset
train_images_aug = train_datagen.flow_from_dataframe(
    dataframe=image_df,
    x_col='Filepath',
    y_col='Label',
    target_size=(224, 224),
    color_mode='rgb',
    class_mode='categorical',
    batch_size=32,
    shuffle=True,
    seed=42,
    subset='training'
)
This also prints: Found 7982 validated image filenames belonging to 25 classes.
I am not sure whether train_images_aug should report more training images than the standard train_images generator. Should the count be higher after augmentation?
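
For reference, the arithmetic behind the reported count can be sketched in plain Python, without Keras. It assumes that the "Found ..." message counts the file names in the dataframe, and that the random transforms are applied on the fly each time a batch is drawn rather than being written out as new files. The helper names `batches_per_epoch` and `augmented_samples_seen` are hypothetical, introduced only for this illustration.

```python
import math


def batches_per_epoch(num_files: int, batch_size: int) -> int:
    """Number of batches needed to pass over every file once."""
    return math.ceil(num_files / batch_size)


def augmented_samples_seen(num_files: int, epochs: int) -> int:
    """Total randomly transformed samples drawn over several epochs.

    Each epoch draws every file once, with a fresh random transform,
    so the file count stays fixed while the variants accumulate.
    """
    return num_files * epochs


num_files = 7982   # count reported by flow_from_dataframe
batch_size = 32

print(batches_per_epoch(num_files, batch_size))       # 250
print(augmented_samples_seen(num_files, epochs=10))   # 79820
```

Under these assumptions the file count stays at 7982 with or without augmentation. What changes is that each epoch sees a differently transformed version of every image.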