I am preparing my data for training an image recognition model. I currently have one folder (the dataset) that contains multiple folders with the names of the labels and these folders have the images inside them.
I want to somehow split this dataset so that I have two main folders with the same subfolders, but the number of images inside these folders to be according to a preferred train/test split, so for instance 90% of the images in the train dataset and 10% in the test dataset.
I am struggling with finding the best way how to split my data. I have read a suggestion that pytorch torch.utils.Dataset class might be a way to do it but I can't seem to get it working as to preserve the folder hierarchy.