
I am trying to apply an autoencoder to a custom dataset in PyTorch, using the following tutorial as a guide.

https://pytorch.org/tutorials/beginner/data_loading_tutorial.html

My dataset consists of images and values associated with those images.

I run into issues when trying to train the model because the items in the dataset I created act like a dictionary, while the model expects tensors.
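For context, a minimal sketch of the kind of Dataset described above (the class name, field values, and dict keys here are stand-ins mirroring the output below, not the asker's actual code): each item is a dict holding an image tensor and its associated values, and the default collate function stacks each dict field into a batched tensor.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class HardnessDataset(Dataset):
    """Hypothetical dataset whose items are dicts, like in the question."""
    def __init__(self, images, hardness):
        self.images = images        # e.g. a list of 2D image arrays
        self.hardness = hardness    # values associated with each image

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Each item is a dict, not a bare tensor
        return {
            'image': torch.as_tensor(self.images[idx], dtype=torch.float64),
            'hardness_arr': torch.as_tensor(self.hardness[idx], dtype=torch.float64),
        }

ds = HardnessDataset([[[0.5, 0.6], [0.4, 0.7]]], [[54.16, 8.16]])
dl = DataLoader(ds, batch_size=1)

# The default collate_fn preserves the dict structure and batches each field
batch = next(iter(dl))
print(type(batch))           # <class 'dict'>
print(batch['image'].shape)  # torch.Size([1, 2, 2])
```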

dataloader type =  <class 'torch.utils.data.dataloader.DataLoader'>

for item in dataloader:

<class 'dict'>
{'image': tensor([[[[0.5270, 0.5667, 0.6228,  ..., 0.6588, 0.6748, 0.6787],
          [0.5551, 0.5591, 0.6049,  ..., 0.6441, 0.6659, 0.6718],
          [0.4706, 0.4230, 0.5051,  ..., 0.6439, 0.6608, 0.6627],
          ...,
          [0.6914, 0.6478, 0.6574,  ..., 0.5968, 0.5735, 0.5676],
          [0.6814, 0.6475, 0.6713,  ..., 0.5664, 0.5059, 0.4669],
          [0.6588, 0.6623, 0.6914,  ..., 0.4667, 0.3453, 0.3132]],

         [[0.5061, 0.5270, 0.6422,  ..., 0.6642, 0.6787, 0.6931],
          [0.5375, 0.5426, 0.6169,  ..., 0.6520, 0.6711, 0.6730],
          [0.3980, 0.3314, 0.4326,  ..., 0.6608, 0.6738, 0.6716],
          ...,
          [0.7196, 0.6326, 0.6333,  ..., 0.6000, 0.5814, 0.5801],
          [0.6877, 0.6005, 0.6537,  ..., 0.5368, 0.4382, 0.3944],
          [0.6828, 0.6755, 0.6983,  ..., 0.3880, 0.2973, 0.2806]],

         [[0.5456, 0.5431, 0.5968,  ..., 0.6657, 0.6650, 0.6735],
          [0.4939, 0.3998, 0.4319,  ..., 0.6650, 0.6765, 0.6676],
          [0.4272, 0.2904, 0.3010,  ..., 0.6618, 0.6745, 0.6583],
          ...,
          [0.7324, 0.6740, 0.6505,  ..., 0.5436, 0.4958, 0.4971],
          [0.6939, 0.6836, 0.6887,  ..., 0.4218, 0.3672, 0.3875],
          [0.6669, 0.6686, 0.6826,  ..., 0.3093, 0.3108, 0.3341]],

         [[0.9961, 0.9961, 0.9961,  ..., 0.9961, 0.9961, 0.9961],
          [0.9961, 0.9961, 0.9961,  ..., 0.9961, 0.9961, 0.9961],
          [0.9961, 0.9961, 0.9961,  ..., 0.9961, 0.9961, 0.9961],
          ...,
          [0.9961, 0.9961, 0.9961,  ..., 0.9961, 0.9961, 0.9961],
          [0.9961, 0.9961, 0.9961,  ..., 0.9961, 0.9961, 0.9961],
          [0.9961, 0.9961, 0.9961,  ..., 0.9961, 0.9961, 0.9961]]]],
       dtype=torch.float64), 'hardness_arr': tensor([[[ 54.1600,   8.1600],
         [  7.4000, -38.6000]]], dtype=torch.float64)}

Can I convert from a dataset dictionary to a tensor, or is there a better way of approaching the issue?

autoencoder = AutoEncoder()

optimizer = torch.optim.Adam(autoencoder.parameters(), lr=LR)
loss_func = nn.MSELoss()

for epoch in range(EPOCH):
    for step, (dataloader['image'], dataloader['hardness_arr']) in enumerate(dataloader['image']):
        b_x = x['image'].view(-1, crop_size*crop_size)   # batch x, shape (batch, 28*28)
        b_y = x['image'].view(-1, crop_size*crop_size)   # batch y, shape (batch, 28*28)

        encoded, decoded = autoencoder(b_x)

        loss = loss_func(decoded, b_y)      # mean square error
        optimizer.zero_grad()               # clear gradients for this training step
        loss.backward()                     # backpropagation, compute gradients
        optimizer.step()                    # apply gradients

TypeError                                 Traceback (most recent call last)
<ipython-input-82-25093ba202c6> in <module>
      5 
      6 for epoch in range(EPOCH):
----> 7     for step, (dataloader['image'], dataloader['hardness_arr']) in enumerate(dataloader['image']):
      8         b_x = x['image'].view(-1, crop_size*crop_size)   # batch x, shape (batch, 28*28)
      9         b_y = x['image'].view(-1, crop_size*crop_size)   # batch y, shape (batch, 28*28)

TypeError: 'DataLoader' object is not subscriptable

Alternatively, if I try to enumerate over the dataloader itself (keeping the subscripted names as the loop variables), I receive the following error.

TypeError: 'DataLoader' object does not support item assignment
    You should use `for step, x in enumerate(dataloader):` then you can access your tensors like `x['image']`. The code you posted to iterate through your dataloader doesn't look like valid python to me. It's possible that the format of the dataset items isn't consistent with the default collate function used by DataLoader. You could provide a custom [collate function](https://pytorch.org/docs/stable/data.html#working-with-collate-fn) if you need to. – jodag Oct 17 '19 at 20:59
  • @jodag This is much appreciated. I was able to run the model without error by using "for step, x in enumerate(dataloader):" Now I just have to make sure the model was actually doing what I expected :) – Random Engineer Oct 20 '19 at 19:02
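A runnable sketch of the fix suggested in the comments, with the question's `AutoEncoder`, `crop_size`, `EPOCH`, `LR`, and dataset replaced by minimal stand-ins (the asker's actual definitions are not shown, so these are assumptions): iterate over the DataLoader itself, and index into each batch dict inside the loop.

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

crop_size, EPOCH, LR = 4, 2, 1e-3  # stand-ins for the question's hyperparameters

class AutoEncoder(nn.Module):
    """Minimal stand-in for the question's model: returns (encoded, decoded)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(crop_size * crop_size, 8)
        self.decoder = nn.Linear(8, crop_size * crop_size)

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded

class DictDataset(Dataset):
    """Stand-in dataset yielding dict items like the question's."""
    def __init__(self, n):
        self.images = torch.rand(n, crop_size, crop_size)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, i):
        return {'image': self.images[i], 'hardness_arr': torch.zeros(2, 2)}

dataloader = DataLoader(DictDataset(8), batch_size=4)
autoencoder = AutoEncoder()
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=LR)
loss_func = nn.MSELoss()

for epoch in range(EPOCH):
    for step, x in enumerate(dataloader):        # x is a dict of batched tensors
        b_x = x['image'].view(-1, crop_size * crop_size)
        b_y = b_x.clone()                        # autoencoder target is the input

        encoded, decoded = autoencoder(b_x)
        loss = loss_func(decoded, b_y)           # mean squared error
        optimizer.zero_grad()                    # clear gradients for this step
        loss.backward()                          # backpropagation
        optimizer.step()                         # apply gradients
```

Note that the default collate function keeps the dtype of the dataset's tensors, so if a real dataset yields `float64` (as in the output above), the batch may need `.float()` before being fed to a `float32` model.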

0 Answers