Dataset (from torch.utils.data) is an abstract class representing a dataset. Your custom dataset should inherit from Dataset and override the following methods:
__len__(), so that len(dataset) returns the size of the dataset.
__getitem__(), to support indexing, so that dataset[i] returns the i-th sample.
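As a minimal sketch of that contract, the class below keeps no files at all and just computes each sample on demand. The name SquaresDataset and its size argument are made up for illustration; only Dataset itself comes from torch.utils.data.

```python
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical dataset: sample i is the pair (i, i * i)."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        # Lets len(dataset) report the number of samples.
        return self.size

    def __getitem__(self, index):
        # Lets dataset[index] return one sample.
        if not 0 <= index < self.size:
            raise IndexError(index)
        return index, index * index


dataset = SquaresDataset(5)
print(len(dataset))
print(dataset[2])
```

Raising IndexError for out-of-range indices is a design choice here; it lets plain Python iteration over the dataset stop cleanly.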
Example of writing a custom Dataset
I have written a general custom dataset that fits your problem statement.
Here data.txt holds the data and label.txt holds the labels, one sample per line.
import torch
from torch.utils.data import Dataset

class CustomDataset(Dataset):
    def __init__(self):
        # Read every line of both files into memory once.
        with open('data.txt', 'r') as f:
            self.data_info = f.readlines()
        with open('label.txt', 'r') as f:
            self.label_info = f.readlines()

    def __getitem__(self, index):
        # Strip the trailing newline from the requested data/label pair.
        single_data = self.data_info[index].rstrip('\n')
        single_label = self.label_info[index].rstrip('\n')
        return single_data, single_label

    def __len__(self):
        return len(self.data_info)

# Testing
d = CustomDataset()
print(d[1])  # prints the second sample as a (data, label) tuple
This is a basic starting point for your case. Note: you will have to adapt it to match your own dataset.
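One possible adaptation, sketched under assumptions that may not hold for your files: each line of data.txt is taken to contain whitespace-separated numbers, and each line of label.txt a single integer class. The class name NumericTextDataset, the file paths passed to it, and the sample values written below are all hypothetical. The sketch parses each line into tensors and batches them with DataLoader.

```python
import os
import tempfile

import torch
from torch.utils.data import DataLoader, Dataset


class NumericTextDataset(Dataset):
    """Hypothetical variant that parses each text line into tensors."""

    def __init__(self, data_path, label_path):
        with open(data_path, "r") as f:
            self.rows = [line.split() for line in f if line.strip()]
        with open(label_path, "r") as f:
            self.labels = [line.strip() for line in f if line.strip()]
        if len(self.rows) != len(self.labels):
            raise ValueError("data and label files have different lengths")

    def __getitem__(self, index):
        # Assumption: features are floats, labels are integer class ids.
        features = torch.tensor([float(v) for v in self.rows[index]])
        label = torch.tensor(int(self.labels[index]))
        return features, label

    def __len__(self):
        return len(self.rows)


# Write two tiny made-up files so the sketch runs on its own.
tmp_dir = tempfile.mkdtemp()
data_path = os.path.join(tmp_dir, "data.txt")
label_path = os.path.join(tmp_dir, "label.txt")
with open(data_path, "w") as f:
    f.write("1.0 2.0\n3.0 4.0\n5.0 6.0\n7.0 8.0\n")
with open(label_path, "w") as f:
    f.write("0\n1\n0\n1\n")

dataset = NumericTextDataset(data_path, label_path)
# DataLoader stacks the per-sample tensors into batches.
loader = DataLoader(dataset, batch_size=2, shuffle=False)
batches = list(loader)
first_features, first_labels = batches[0]
print(first_features.shape, first_labels)
```

Passing the paths to the constructor instead of hard-coding them makes it easier to reuse the same class for separate training and validation files.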