I am training a UNET for semantic segmentation but I only have 200 labeled images. Given the small size of the dataset, it definitely needs some data augmentation techniques.
I have question about the test and the validation set.
I have custom data generator which keep feeding data from folder for training model.
So what I plan to do is:
do data augmentation for the training set and keep all of it in the same folder
"randomly" pick some of training data into test and validation set (of course, before training).
I am not sure if this is fine, since we just do some simple processing (flipping, transposing, adjusting brightness)
Would it be better to separate the data first and do the augmentation for the rest of data in the training folder?