Data augmentation in validation

Question

I am a little confused about data augmentation. If I apply data augmentation to the training dataset, should the validation dataset get the same operations? For example:

from torchvision import transforms

data_transforms = {
    # Training: random crop and flip, so each epoch sees a different view
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    # Validation: deterministic resize and center crop only
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

Why do we apply the 'Resize' and 'CenterCrop' operations to the 'val' dataset?

score 1 · Accepted Answer · answered Dec 19 '18 at 05:31

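Since validation data is used to measure how good a trained model is, it should not change from one trained model to the next. That is, we should evaluate every model against the same fixed yardstick. This is why the validation transforms contain none of the randomness that exists in the training augmentation. 'Resize' and 'CenterCrop' are deterministic: they only bring each validation image to the same 224x224 input size the model was trained on, so the same image always produces the same input.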
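To make the difference concrete, here is a minimal sketch in plain Python (no torchvision), assuming a toy 6x6 "image" stored as a list of lists. The helpers `center_crop` and `random_crop` are hypothetical stand-ins written for this illustration, not the torchvision implementations. The point it tries to show is only that a center crop returns the same window on every call, while a random crop may not.

```python
import random

def center_crop(image, size):
    """Deterministic crop: always returns the same central size x size window."""
    h, w = len(image), len(image[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]

def random_crop(image, size, rng):
    """Random crop: the window position is drawn from rng on every call."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - size)
    left = rng.randint(0, w - size)
    return [row[left:left + size] for row in image[top:top + size]]

# Toy 6x6 image whose pixel values are 0..35 (an assumption for illustration).
image = [[r * 6 + c for c in range(6)] for r in range(6)]
rng = random.Random(0)

# "Validation" views: identical on every call, so scores are comparable.
val_views = [center_crop(image, 4) for _ in range(5)]
# "Training" views: the window can move from call to call.
train_views = [random_crop(image, 4, rng) for _ in range(5)]

val_is_stable = all(view == val_views[0] for view in val_views)
print("validation views identical:", val_is_stable)
print("distinct training windows:", len({tuple(map(tuple, v)) for v in train_views}))
```

If the validation pipeline were random, two evaluations of the same model could give different scores, which makes comparisons between models or hyperparameter settings harder to trust.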

SIDE NOTE:

Unlike test data, validation data is used to tune the hyperparameters.

score 1 · Answer 2 · answered Jun 08 '21 at 09:48

I strongly disagree with @Yashio Yamauchi's answer. Yes, data augmentation is commonly applied to the training dataset to increase the number of samples when the dataset is small. However, there are cases where your validation dataset is also small, so you cannot reliably evaluate your model on it.

For example, let's say that your task is to recognise logos on T-shirts (e.g. the Adidas logo), no matter how they are rotated in an image (e.g. by 90 degrees). Then you will have to use data augmentation to ensure that your model is fed rotated T-shirts. However, if you want to measure how well your model identifies "Adidas" when it is rotated 90 degrees, then you will need images of rotated T-shirts in your validation dataset as well.

In such a case, data augmentation can be applied to the validation dataset once, before training starts, so that the augmented validation set stays fixed afterwards.
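One way to read "once, before training" is to build the augmented validation set a single time with a seeded random generator and then reuse it unchanged. The following is a rough sketch under that assumption, in plain Python with images as lists of lists. The names `rotate90`, `rotate`, and `build_fixed_val_set` are hypothetical helpers for this illustration, not part of any library.

```python
import random

def rotate90(image):
    """Rotate a 2D list-of-lists image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def rotate(image, quarter_turns):
    """Apply rotate90 the given number of times (modulo 4)."""
    for _ in range(quarter_turns % 4):
        image = rotate90(image)
    return image

def build_fixed_val_set(images, seed=0):
    """Augment each validation image once, with a seeded RNG.

    The same seed yields the same rotations, so the resulting set can be
    stored and reused for every model that is evaluated.
    """
    rng = random.Random(seed)
    return [rotate(img, rng.randrange(4)) for img in images]

# Two tiny placeholder "images" (assumed data, for illustration only).
val_images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

# Built once, before training; every evaluation then uses this same list.
fixed_val_set = build_fixed_val_set(val_images, seed=0)
print(fixed_val_set)
```

This keeps the idea from the accepted answer (a fixed yardstick) while still covering rotated inputs: the randomness is spent once when the set is built, not on every evaluation.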
