In my custom dataset, each kind of image is in its own folder, which torchvision.datasets.ImageFolder can handle, but how do I split the dataset into train and test sets?

Shai

toy_programmer
Does this answer your question? [How do I split a custom dataset into training and test datasets?](https://stackoverflow.com/questions/50544730/how-do-i-split-a-custom-dataset-into-training-and-test-datasets) – Aray Karjauv May 24 '22 at 15:16
1 Answer
You can use `torch.utils.data.Subset` to split your `ImageFolder` dataset into train and test based on indices of the examples.
For example:
import torch
import torchvision

orig_set = torchvision.datasets.ImageFolder(...)  # your dataset
n = len(orig_set)  # total number of examples
n_test = int(0.1 * n)  # take ~10% for test
test_set = torch.utils.data.Subset(orig_set, range(n_test))  # take first 10%
train_set = torch.utils.data.Subset(orig_set, range(n_test, n))  # take the rest
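
Note that `ImageFolder` lists its samples class by class, so taking the first 10% of indices puts only the first classes in the test set. A minimal sketch of one way around this is to shuffle the indices before slicing. The helper name `shuffled_split_indices` and the seed value are assumptions made for illustration, not part of PyTorch:

```python
import random


def shuffled_split_indices(n, test_fraction=0.1, seed=0):
    """Return (train_indices, test_indices) taken from a shuffled range(n).

    Hypothetical helper for illustration; not a torch or torchvision function.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = int(test_fraction * n)       # same ~10% rule as in the answer
    return indices[n_test:], indices[:n_test]


# Example with 100 samples; in practice n would be len(orig_set).
train_idx, test_idx = shuffled_split_indices(100, test_fraction=0.1)
```

The two lists can then be passed to `torch.utils.data.Subset(orig_set, train_idx)` and `torch.utils.data.Subset(orig_set, test_idx)` in place of the `range(...)` arguments. A shuffled split is random, not stratified, so class proportions are only approximately preserved.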

Shai
@toy_programmer you might need to be more careful in the way you select indices for train/test – Shai Jul 29 '19 at 07:32
Agree with @Shai. If the data is ordered, you will take only the first 10% of the classes – Enoon May 13 '20 at 09:18
Better to use [`random_split`](https://pytorch.org/docs/stable/data.html?highlight=random_split#torch.utils.data.random_split) – Aray Karjauv May 24 '22 at 15:18
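
For reference, a minimal sketch of the `random_split` approach. To keep it runnable without an image directory, a small `TensorDataset` stands in for the `ImageFolder` dataset; the stand-in data, the 100-sample size, and the seed are assumptions made for illustration:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for the ImageFolder dataset: 100 fake samples with labels 0..9,
# ordered by class the way ImageFolder orders its samples.
features = torch.zeros(100, 3)
labels = torch.arange(100) // 10
orig_set = TensorDataset(features, labels)

n = len(orig_set)
n_test = int(0.1 * n)  # ~10% for test
n_train = n - n_test   # the rest for training

# A seeded generator makes the random split reproducible.
generator = torch.Generator().manual_seed(0)
train_set, test_set = random_split(orig_set, [n_train, n_test], generator=generator)
```

Both results are `Subset` objects over `orig_set`, so they share the original dataset's transforms and can be passed to a `DataLoader` directly.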