I have a custom Dataset that loads data from large files. Sometimes the loaded data are empty, and I don't want to use them for training.
In the Dataset I have:
def __getitem__(self, i):
    (x, y) = self.getData(i)  # getData loads the data and handles problems
    return (x, y)
In case of bad data, this returns (None, None), i.e. x and y are both None. However, it later fails in DataLoader, and I am not able to skip this batch entirely. I have the batch size set to 1.
from torch.utils.data import DataLoader

trainLoader = DataLoader(trainDataset, batch_size=1, shuffle=False)
for x_batch, y_batch in trainLoader:
    ...  # process and train
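
For reference, one direction that might work here is a custom collate_fn that drops the (None, None) samples and returns None when nothing is left, so the training loop can skip that batch. This is only a minimal sketch: the name make_skip_none_collate is hypothetical (not part of PyTorch), and it assumes every sample is an (x, y) pair as in the Dataset above.

```python
def make_skip_none_collate(base_collate):
    """Wrap a collate function so that bad samples are dropped.

    base_collate is whatever would normally merge a list of samples
    into a batch (with PyTorch, typically default_collate).
    """
    def collate(batch):
        # Keep only samples where both x and y were actually loaded.
        kept = [(x, y) for (x, y) in batch if x is not None and y is not None]
        if not kept:
            # Nothing usable in this batch: signal the caller to skip it.
            return None
        return base_collate(kept)
    return collate
```

With PyTorch it would presumably be wired in roughly like this, checking for None before unpacking:

    from torch.utils.data import DataLoader
    from torch.utils.data.dataloader import default_collate

    trainLoader = DataLoader(trainDataset, batch_size=1, shuffle=False,
                             collate_fn=make_skip_none_collate(default_collate))
    for batch in trainLoader:
        if batch is None:
            continue  # every sample in this batch was empty
        x_batch, y_batch = batch

Note that with a batch size larger than 1 this would yield smaller batches whenever some samples are dropped.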