
I am confused about the definition of data augmentation. Should we train on both the original data points and the transformed ones, or only on the transformed ones? Training on both would increase the size of the dataset, while the second approach would not.

This question came up when I was using the RandomResizedCrop function.

'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),

If we randomly resize and crop images from the dataset, we don't actually increase the size of the dataset. Is that correct? Or does data augmentation just require changing/modifying the original dataset rather than increasing its size?

Thanks.

rio

2 Answers


transforms.Compose is just preprocessing: it converts an image from one form into a particular form suitable for the model.

Applying a transformation to a single image means changing its pixel values; it does not increase the dataset size.
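
As a small illustration of that point (the image path and folder below are placeholders, not from the question), applying the same random transform twice to one image gives two different tensors, but the dataset still reports the same length:

import torch
from PIL import Image
from torchvision import datasets, transforms

train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

img = Image.open('some_image.jpg')   # placeholder path
a = train_tf(img)
b = train_tf(img)
print(torch.equal(a, b))             # almost always False: different random crop/flip each call

dataset = datasets.ImageFolder('data/train', transform=train_tf)  # placeholder folder
print(len(dataset))                  # unchanged: one entry per image on disk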

To get a larger dataset, you have to perform operations such as the following:

import numpy as np
from tqdm import tqdm
from skimage.transform import rotate       # rotation transform
from skimage.util import random_noise      # additive Gaussian noise

final_train_data = []
final_target_train = []
for i in tqdm(range(train_x.shape[0])):
    # keep the original image plus four augmented copies of it
    final_train_data.append(train_x[i])
    final_train_data.append(rotate(train_x[i], angle=45, mode='wrap'))
    final_train_data.append(np.fliplr(train_x[i]))
    final_train_data.append(np.flipud(train_x[i]))
    final_train_data.append(random_noise(train_x[i], var=0.2**2))
    # replicate the label once per generated image; this loop must be
    # inside the outer loop, otherwise only the last label gets repeated
    for j in range(5):
        final_target_train.append(train_y[i])

For more detail, see the PyTorch documentation.

Welcome_back
  • Thanks. I found another common way to concatenate a list of datasets in PyTorch using torch.utils.data.ConcatDataset([train_dataset, train_dataset2]). – rio Apr 13 '20 at 05:13

By definition, or at least according to the influential AlexNet paper from 2012 that popularized data augmentation in computer vision, data augmentation increases the size of the training set. Hence the word augmentation. Go ahead and have a look at Section 4.1 of the AlexNet paper. But here is the gist of it, which I'm quoting from the paper:

The easiest and most common method to reduce overfitting on image data is to artificially enlarge the dataset using label-preserving transformations. The first form of data augmentation consists of generating image translations and horizontal reflections. We do this by extracting random 224 × 224 patches (and their horizontal reflections) from the 256×256 images and training our network on these extracted patches. This increases the size of our training set by a factor of 2048, though the resulting training examples are, of course, highly interdependent.
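
(For what it's worth, the factor of 2048 presumably comes from counting roughly 32 × 32 possible crop offsets for a 224×224 patch inside a 256×256 image, times 2 for the horizontal reflection: 32 * 32 * 2 = 2048. The exact number of distinct offsets is 256 − 224 + 1 = 33 per axis, so the paper's figure is a rounded estimate.)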

As for the specific implementation, it depends on your use case and, most importantly, the size of your training data. If you're short on data, you should consider training on both the original and the transformed images, taking sufficient care to preserve the labels.
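
Here is a minimal sketch of one way to do that with torchvision, picking up the torch.utils.data.ConcatDataset approach rio mentions in the comment above; the folder path and the un-augmented transform pipeline are illustrative assumptions, not part of the question:

import torch
from torchvision import datasets, transforms

data_dir = 'data/train'  # hypothetical image folder; replace with your own

normalize = transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

# Original images: deterministic resize/crop and normalization only.
plain_tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])

# Randomly transformed versions of the same images.
augmented_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])

plain_ds = datasets.ImageFolder(data_dir, transform=plain_tf)
augmented_ds = datasets.ImageFolder(data_dir, transform=augmented_tf)

# ConcatDataset doubles the nominal dataset length: each image appears once
# unaltered and once with random transforms applied on the fly.
train_ds = torch.utils.data.ConcatDataset([plain_ds, augmented_ds])
print(len(plain_ds), len(train_ds))  # len(train_ds) == 2 * len(plain_ds)

Note that the augmented half still samples a new random crop and flip every time an item is fetched, so across many epochs the network effectively sees far more than two versions of each image.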

kmario23
  • The approach in that paper is "resize". However, we have different methods of data augmentation. So do we have a preference order of the methods for different types of images? Thx. – rio Apr 13 '20 at 05:23