
I am doing a computer vision project and I need to apply data augmentation. I have 3 classes: two classes with 500 images each and one class with 1000 images. I am going to generate multiple augmented versions of the images. Should I, for example, apply 3 random transformations to each image of the first two classes so that each class reaches 2000 images in total, and apply just one transformation to the last class so that it also reaches 2000 images? Finally, should the data augmentation be applied to the whole dataset before separating it into train and test sets, or should I split first and then apply the augmentation only to the training set? Thank you

SSSSSSSSS
    Definitely do the train-test split and then augment the train data. Never do anything with the test data (except pre-processing). – sourvad Mar 23 '21 at 17:06

2 Answers


Data augmentation is applied only to the training set. Don't touch the test set.

Apply augmentation randomly during training, so a particular image may or may not be augmented in a given epoch.

No need to treat classes separately to deal with class imbalance. Class imbalance is handled with an appropriate loss function, such as weighted cross-entropy or the focal loss used in RetinaNet.
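The per-epoch randomness described above can be sketched in plain Python. This is a minimal illustration, not a full pipeline: `augment` is a hypothetical placeholder for a real transform (random flip, crop, color jitter, ...).

```python
import random


def augment(image):
    """Placeholder for a random transformation (flip, crop, jitter, ...)."""
    return image  # a real pipeline would return a transformed copy


def training_stream(images, p_augment=0.5, epochs=3, seed=0):
    """Yield every image once per epoch, randomly augmented.

    Because the coin is flipped anew each epoch, a given image may be
    augmented in one epoch and left untouched in the next, so the
    effective training set varies from epoch to epoch.
    """
    rng = random.Random(seed)
    for _ in range(epochs):
        for img in images:
            yield augment(img) if rng.random() < p_augment else img
```

Note that the dataset size on disk never changes; the augmentation happens on the fly as batches are drawn, which is how most frameworks (e.g. torchvision transforms) implement it.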

Abhi25t
  • Thank you for your answer @Abhi25t, it's really helpful. I am going to use YOLOv4, so I can't really choose the loss function, and even the data augmentation I need to apply before training. (I'm a total beginner.) – SSSSSSSSS Mar 24 '21 at 08:20
  • The YOLOv4 paper does discuss the benefits of handling class imbalance with a focal loss function (section 2.2), and it goes on to describe how the CIoU loss function is better suited to that problem. I'd suggest you go ahead with the default settings first and then look at the problems, if any. You have to start somewhere. In deep learning, many default settings are good enough; if not, tweaking often works, but you need to identify the problems with the default settings for that. – Abhi25t Mar 24 '21 at 11:59

In order to train properly, you should divide the dataset into 3 sets: a training set, a validation set, and a test set. The test set is your gold standard and you should not touch it during training. You use it only at inference time, when you compute the final metrics.

The validation set is a sort of "support set" during training. You use it to tune hyperparameters like the learning rate or the batch size.

A possible algorithm can be:

  1. Split your dataset into 3 sets, with 70% of the images in the training set, 15% in the validation set, and 15% in the test set. Each class should be represented in each set.
  2. Determine the best hyperparameters training on the training set and validating on the validation set. This will reduce the possibility of overfitting.
  3. Retrain the model on the training+validation set using the hyperparameters of point 2 and evaluate on the test set.

Data augmentation/oversampling should be applied only to the training set. It helps your model generalize, i.e. in this case learn general patterns such as the cat's ears or the dog's nose in a dogs-vs-cats image classification problem.
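Step 1 above (a per-class, i.e. stratified, split) can be sketched in plain Python for the asker's class sizes of 500/500/1000. All names here are illustrative, and real projects would typically use a library helper such as scikit-learn's `train_test_split` with `stratify=` instead.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Split (image, label) pairs 70/15/15 within each class,
    so every class is represented in every set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for img, label in samples:
        by_class[label].append(img)

    train_set, val_set, test_set = [], [], []
    for label, imgs in by_class.items():
        rng.shuffle(imgs)
        n_train = int(len(imgs) * train_frac)
        n_val = int(len(imgs) * val_frac)
        train_set += [(i, label) for i in imgs[:n_train]]
        val_set += [(i, label) for i in imgs[n_train:n_train + n_val]]
        test_set += [(i, label) for i in imgs[n_train + n_val:]]
    return train_set, val_set, test_set
```

Only `train_set` (or `train_set + val_set` in step 3) should then be fed through the augmentation pipeline; the test set stays untouched.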

Jonny_92