
If I enlarge my dataset using augmentations, will I get a better result?

For example, I have 1 class, a dog class, with 4 images for it. I applied augmentations to the 4 images. Now some of these images are augmented and some are not, but I still have 4 images.

Will it be more efficient if I add the original images to the augmented ones? -> That would give 8 images in the dataset. I tried this by changing my custom Dataset, but if I have a lot of images (100000) then Colab tells me bye bye, because it runs out of memory.

Does it matter whether I apply augmentations before creating the dataset, or after creating the dataset, in the training loop like this:

for x, y in train_loader:
    aug_x = aug(x)
    ...
    output = model(aug_x)
    loss = ...
    loss.backward()
    ...

I suppose I need to choose 1 way to apply augmentations to my images: either before the dataset or in the training loop. Am I wrong? Please write your suggestions with code below. Thank you!

Vadim

1 Answer


Usually, appropriately chosen augmentations lead to better results. You are right that preliminarily augmenting your dataset and saving the augmented images consumes all the available memory or disk space in the case of big datasets, so it makes sense to apply augmentations dynamically, on the fly.

A simple PyTorch example:


import cv2
import numpy as np
from torch.utils.data import DataLoader, Dataset


class MyDataset(Dataset):
    def __init__(self, image_paths, size):
        self._image_paths = image_paths
        self._size = size

    def __getitem__(self, idx):
        # Cycle through the available files so the dataset can be "larger"
        # than the number of images on disk.
        path = self._image_paths[idx % len(self._image_paths)]
        image = cv2.imread(path)

        # Insert your augmentations here; random flips are just a placeholder.
        if np.random.rand() < 0.5:
            image = cv2.flip(image, 0)  # vertical flip
        if np.random.rand() < 0.5:
            image = cv2.flip(image, 1)  # horizontal flip

        return image

    def __len__(self):
        return self._size


image_paths = ["1.png"]
loader = DataLoader(MyDataset(image_paths, size=10), batch_size=4)
for batch in loader:
    # The default collate function converts the numpy images into tensors,
    # so convert back to numpy before stacking them side by side.
    batch_images = np.hstack([image.numpy() for image in batch])

    cv2.imshow("image", batch_images)
    cv2.waitKey()
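Regarding your original question about keeping both the originals and the augmented copies (4 -> 8 images): you can get the same effect without saving anything extra by making the dataset report twice its real length and augmenting only the second half of the indices. A minimal sketch, reusing the imports above (the class name OriginalPlusAugmented is just for illustration):

class OriginalPlusAugmented(Dataset):
    """Exposes every image twice: once as-is, once randomly flipped."""

    def __init__(self, image_paths):
        self._image_paths = image_paths

    def __getitem__(self, idx):
        path = self._image_paths[idx % len(self._image_paths)]
        image = cv2.imread(path)

        # Indices past the original count return augmented copies.
        if idx >= len(self._image_paths):
            if np.random.rand() < 0.5:
                image = cv2.flip(image, 0)
            if np.random.rand() < 0.5:
                image = cv2.flip(image, 1)

        return image

    def __len__(self):
        # 4 source images -> a dataset of length 8, without storing 8 files.
        return 2 * len(self._image_paths)

With on-the-fly augmentation this split is rarely necessary, though: since the random transforms change every epoch, the model already sees many different variants of each image over the course of training.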


One special case where this approach works poorly is when the augmentation process takes a lot of time, for example when you need to render 3D objects using a complex pipeline with Blender. Such augmentations would become the bottleneck during training, so it makes sense to save the augmented data to disk first and then use it to enlarge the dataset during training, as in the sketch below.
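In that case the dataset class above can stay unchanged: you only extend the list of paths with the pre-rendered files. A minimal sketch, assuming the original images live in an images/ directory and the pre-rendered augmentations were saved to an augmented/ directory (both directory names are hypothetical):

import glob

original_paths = glob.glob("images/*.png")        # hypothetical source directory
prerendered_paths = glob.glob("augmented/*.png")  # hypothetical pre-rendered augmentations

# The expensive augmentation was done offline; training only reads files from disk.
all_paths = original_paths + prerendered_paths
loader = DataLoader(MyDataset(all_paths, size=len(all_paths)), batch_size=4, shuffle=True)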

The choice of augmentations heavily depends on the domain of your data. Small augmentations may lead to little or no accuracy gain, while very heavy augmentations can distort the training distribution too much, which results in a decrease in quality.
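As a rough illustration of "light" versus "heavy", here is a sketch using torchvision transforms (the parameters are arbitrary and only meant to show the difference in strength; note that these transforms expect PIL images or tensors rather than OpenCV arrays):

from torchvision import transforms

# Mild pipeline: small, label-preserving changes.
light_aug = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.1, contrast=0.1),
])

# Aggressive pipeline: risks distorting the training distribution too much.
heavy_aug = transforms.Compose([
    transforms.RandomRotation(degrees=90),
    transforms.ColorJitter(brightness=0.8, contrast=0.8, saturation=0.8, hue=0.4),
    transforms.RandomResizedCrop(size=224, scale=(0.1, 1.0)),
])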

If you are interested in image augmentations you can check out these projects:

https://github.com/aleju/imgaug

https://github.com/albumentations-team/albumentations

https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html

u1234x1234