
I'm trying to create a custom PyTorch dataset to plug into DataLoader that is composed of single-channel images (20000 x 1 x 28 x 28), single-channel masks (20000 x 1 x 28 x 28), and three labels (20000 x 3).

Following the documentation, I thought I would test creating a dataset with a single-channel image and a single-channel mask, using the following code:

class CustomDataset(Dataset):

    def __init__(self, images, masks, transforms=None, train=True): 
        self.images = images
        self.masks = masks
        self.transforms = transforms

    def __getitem__(self, index):
        image = self.images.iloc[index, :]
        image = np.asarray(image).astype(np.float64).reshape(28, 28, 1)
        mask = self.masks.iloc[index, :]
        mask = np.asarray(mask).astype(np.float64).reshape(28, 28, 1)
        transformed_image = self.transforms(image)
      
        return transformed_image, mask

    def __len__(self):
        return len(self.images)

Using the class, I form the dataset from two pandas dataframes and plug into DataLoader.

transform = transforms.Compose(
    [transforms.ToPILImage(),
     transforms.ToTensor(),
     transforms.Normalize((0.5, ), (0.5, ))
])

train_images = pd.read_csv('train.csv')
train_masks = pd.read_csv('masks.csv')

train_data = CustomDataset(train_images, train_masks, transform)
trainloader = DataLoader(train_data, batch_size=128, shuffle=True)

I would expect the shape of a single batch in trainloader to be ([128, 1, 28, 28], [128, 1, 28, 28]), for both the image on the left and the mask on the right.

Instead, the shape of a single batch from trainloader is ([128, 1, 28, 28], [128]), which makes me think that the masks have somehow been transformed into labels.

How do I fix this, and how do I add in the actual labels in addition to a mask? Thanks in advance for your help!

morepenguins
1 Answer


Perhaps you need to apply the transform to the mask too (excluding normalization), like:

class CustomDataset(Dataset):

    def __init__(self, images, masks, transforms_image=None, transforms_mask=None, train=True):
        self.images = images
        self.masks = masks
        self.transforms_image = transforms_image
        self.transforms_mask = transforms_mask

    def __getitem__(self, index):
        image = self.images.iloc[index, :]
        image = np.asarray(image).astype(np.float64).reshape(28, 28, 1)
        mask = self.masks.iloc[index, :]
        mask = np.asarray(mask).astype(np.float64).reshape(28, 28, 1)
        # Apply the image transform (with normalization) to the image,
        # and the mask transform (without normalization) to the mask.
        transformed_image = self.transforms_image(image)
        transformed_mask = self.transforms_mask(mask)

        return transformed_image, transformed_mask

    def __len__(self):
        return len(self.images)

and

transform_image = transforms.Compose(
    [transforms.ToPILImage(),
     transforms.ToTensor(),
     transforms.Normalize((0.5, ), (0.5, ))
])

transform_mask = transforms.Compose(
    [transforms.ToPILImage(),
     transforms.ToTensor()
])

train_images = pd.read_csv('train.csv')
train_masks = pd.read_csv('masks.csv')

train_data = CustomDataset(train_images, train_masks, transform_image, transform_mask)
trainloader = DataLoader(train_data, batch_size=128, shuffle=True)
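
To also return the labels asked about in the question, the dataset can hold a third dataframe and return a triple from __getitem__. Here is a minimal sketch with the torchvision transforms left optional, using random data in place of the CSV files (the labels dataframe with three columns is an assumption based on the 20000 x 3 shape in the question):

```python
import numpy as np
import pandas as pd

class CustomDatasetWithLabels:
    """Duck-typed dataset: any object with __getitem__ and __len__
    can be wrapped by torch.utils.data.DataLoader."""

    def __init__(self, images, masks, labels,
                 transforms_image=None, transforms_mask=None):
        self.images = images
        self.masks = masks
        self.labels = labels  # hypothetical third dataframe, one row of 3 values per sample
        self.transforms_image = transforms_image
        self.transforms_mask = transforms_mask

    def __getitem__(self, index):
        image = np.asarray(self.images.iloc[index, :]).astype(np.float32).reshape(28, 28, 1)
        mask = np.asarray(self.masks.iloc[index, :]).astype(np.float32).reshape(28, 28, 1)
        label = np.asarray(self.labels.iloc[index, :]).astype(np.float32)
        if self.transforms_image is not None:
            image = self.transforms_image(image)
        if self.transforms_mask is not None:
            mask = self.transforms_mask(mask)
        return image, mask, label

    def __len__(self):
        return len(self.images)

# Tiny demo with random data standing in for the CSVs.
rng = np.random.default_rng(0)
images = pd.DataFrame(rng.random((4, 784)))
masks = pd.DataFrame(rng.random((4, 784)))
labels = pd.DataFrame(rng.random((4, 3)))

ds = CustomDatasetWithLabels(images, masks, labels)
img, msk, lbl = ds[0]
print(img.shape, msk.shape, lbl.shape)  # (28, 28, 1) (28, 28, 1) (3,)
```

With the transforms applied, DataLoader's default collation would then stack each of the three returned items into its own batch dimension, giving [128, 1, 28, 28], [128, 1, 28, 28], and [128, 3].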

navneeth