
I have a list of tensors where each tensor has a different size. How can I convert this list of tensors into a single tensor using PyTorch?

For instance,

x[0].size() == torch.Size([4, 8])
x[1].size() == torch.Size([4, 7])  # different shapes!

This:

torch.tensor(x)

Gives the error:

ValueError: only one element tensors can be converted to Python scalars
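
For context, a minimal reproduction (shapes taken from the question): both `torch.tensor` and `torch.stack` reject a list of differently-shaped tensors.

```python
import torch

# two tensors whose last dimensions differ, as in the question
x = [torch.ones(4, 8), torch.ones(4, 7)]

try:
    torch.tensor(x)    # cannot build one tensor from ragged shapes
except Exception as e:
    print("torch.tensor failed:", type(e).__name__)

try:
    torch.stack(x)     # stack also requires identical shapes
except RuntimeError as e:
    print("torch.stack failed:", type(e).__name__)
```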

Omar Abdelaziz (asker); edited by Mateen Ulhaq

  • Please provide more of your code. – Fábio Perez Mar 07 '19 at 21:52
  • for item in features: x.append(torch.tensor((item))) – Omar Abdelaziz Mar 07 '19 at 23:46
  • This gives me a list of tensors, but each tensor has a different size, so when I try torch.stack(x) it gives me the same error @FábioPerez – Omar Abdelaziz Mar 07 '19 at 23:48
  • Your question needs more detail: where does this data come from? If from images, just resize all the images to the same shape before converting to torch tensors. It's easier for anyone here to help if you explain the data processing in more detail; at least then we can suggest a strategy for processing the data. – Wahyu Bram Jun 12 '20 at 19:18
  • Ran into this problem trying `torch.tensor(torch.split(sometuple))`. Same applies... variable length doesn't work out. – rocksNwaves Dec 09 '20 at 17:10
  • if you are interested in the case with same lengths here is the solution: https://stackoverflow.com/questions/54307225/whats-the-difference-between-torch-stack-and-torch-cat-functions/66036075#66036075 – Charlie Parker Feb 09 '21 at 20:01
  • there is also a solution I believe for your question in the pytorch forum also: https://discuss.pytorch.org/t/nested-list-of-variable-length-to-a-tensor/38699/21 – Charlie Parker Feb 09 '21 at 20:01

2 Answers


You might be looking for cat.

However, tensors cannot hold variable-length data.

For example, here we have a list of two tensors that differ in size in their last dimension (dim=2), and we want to create a larger tensor containing the data of both, so we can concatenate them along that dimension with cat.

Also note that, at the time of writing, you can't use cat with half tensors on the CPU, so you should convert them to float, do the concatenation, and then convert back to half.

import torch

a = torch.arange(8).reshape(2, 2, 2)
b = torch.arange(12).reshape(2, 2, 3)
my_list = [a, b]
my_tensor = torch.cat([a, b], dim=2)
print(my_tensor.shape)  # torch.Size([2, 2, 5])

You haven't explained your goal, so another option is to pad the tensors to a common length with pad_sequence, like this:

from torch.nn.utils.rnn import pad_sequence
a = torch.ones(25, 300)
b = torch.ones(22, 300)
c = torch.ones(15, 300)
pad_sequence([a, b, c]).size()  # torch.Size([25, 3, 300])
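
pad_sequence stacks the tensors along a new dimension and pads the first dimension up to the longest; if you want the batch dimension first, pass batch_first=True. A small sketch with the same shapes as above:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# three tensors of the same width but different lengths
a = torch.ones(25, 300)
b = torch.ones(22, 300)
c = torch.ones(15, 300)

# shorter tensors are zero-padded up to length 25;
# batch_first=True puts the batch dimension first
padded = pad_sequence([a, b, c], batch_first=True)
print(padded.shape)  # torch.Size([3, 25, 300])
```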

Edit: in this particular case, you can use torch.cat([x.float() for x in my_list], dim=1).half()
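
A sketch of that workaround with shapes like the asker's (the sizes here are illustrative, scaled down from the question's 76080-row tensors); note that newer PyTorch versions do support cat on CPU half tensors, so the float round trip may no longer be necessary.

```python
import torch

# two half-precision tensors that differ only in the second dimension
a = torch.ones(5, 38, dtype=torch.half)
b = torch.ones(5, 36, dtype=torch.half)

# convert to float, concatenate along dim=1, then convert back to half
merged = torch.cat([x.float() for x in [a, b]], dim=1).half()
print(merged.shape)  # torch.Size([5, 74])
print(merged.dtype)  # torch.float16
```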

Separius (answerer); edited by Mateen Ulhaq

  • Hey Separius, thanks for the answer, but could you explain what dim is and how I should set it? – Omar Abdelaziz Mar 08 '19 at 19:20
  • also I have a list of tensor sorted descendly and the shape of the first tensor is torch.Size([76080, 38]) – Omar Abdelaziz Mar 08 '19 at 19:25
  • the shape of the other tensors would differ in the second element for example the second tensor in the list is torch.Size([76080, 36]) – Omar Abdelaziz Mar 08 '19 at 19:26
  • when I remove the dim I get this error RuntimeError: _th_cat is not implemented for type torch.HalfTensor – Omar Abdelaziz Mar 08 '19 at 19:28
  • I tried to use pad sequence but it gives me this error TypeError: _pack_padded_sequence(): argument 'input' (position 1) must be Tensor, not list – Omar Abdelaziz Mar 08 '19 at 19:32
  • invalid argument 0: Sizes of tensors must match except in dimension 0. Got 38 and 36 in dimension 1 – Omar Abdelaziz Mar 08 '19 at 19:37
  • yay... it worked thank u so much so but could u explain what happens that x.long() because I got a tensor of zeros tensor([[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]]) – Omar Abdelaziz Mar 08 '19 at 19:42
  • you have got all zeros? well, that's bad. can you make sure that your initial tensors are not all zeros? you can easily check that via (my_tensor == 0).all() – Separius Mar 08 '19 at 19:46
  • damn! :D @OmarAbdelaziz you should use .float() not .long(). this will solve your problem; you can also change the resulting tensor back to half, if you want – Separius Mar 08 '19 at 19:47
  • you can do all of these in one line torch.cat([x.float() for x in my_list], dim=1).half() – Separius Mar 08 '19 at 19:48
  • @OmarAbdelaziz did it work? if so can you mark my answer as correct? – Separius Mar 08 '19 at 19:52
  • there u go here it works thank you very very much ,...... :D <3 I just have one question I know I ask a lot but this is the last one in torch.cat([x.float() for x in my_list], dim=1).half() what does .half() do? – Omar Abdelaziz Mar 08 '19 at 19:53
  • great, glad I helped. cat over a list of float tensors will produce a float tensor, but you were using half (16-bit floating point numbers) to represent your data, so you can convert this newly created tensor back to half without losing data – Separius Mar 08 '19 at 19:57
  • thank a lot u can add the torch.cat([x.float() for x in my_list], dim=1).half() to the answer because it's a great answer – Omar Abdelaziz Mar 08 '19 at 19:59
  • done! + can you also add some more info to your question so others can benefit? you should mention in your question that you have tensors of different sizes in the second dimension and they are half not float – Separius Mar 08 '19 at 20:04
  • not sure if this is useful but here is a solution from the pytorch forum: https://discuss.pytorch.org/t/nested-list-of-variable-length-to-a-tensor/38699/21 – Charlie Parker Feb 09 '21 at 20:02

A Tensor in PyTorch isn't like a list in Python, which can hold objects of variable length.

In PyTorch, you can convert a fixed-shape nested list to a Tensor:

>>> torch.Tensor([[1, 2], [3, 4]])
tensor([[1., 2.],
        [3., 4.]])

But not a ragged (variable-length) one:

>>> torch.Tensor([[1, 2], [3, 4, 5]])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-809c707011cc> in <module>
----> 1 torch.Tensor([[1, 2], [3, 4, 5]])

ValueError: expected sequence of length 2 at dim 1 (got 3)

The same applies to torch.stack.
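
To illustrate with a small self-contained sketch: stack requires every tensor in the list to have exactly the same shape, and it adds a new leading dimension; with mismatched shapes it fails just like torch.Tensor does above.

```python
import torch

# identical shapes: stack adds a new leading (batch) dimension
same = [torch.zeros(2, 3), torch.ones(2, 3)]
print(torch.stack(same).shape)  # torch.Size([2, 2, 3])

# mismatched shapes: stack raises an error
different = [torch.zeros(2, 3), torch.ones(2, 4)]
try:
    torch.stack(different)
except RuntimeError as e:
    print("stack failed:", type(e).__name__)
```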

cloudyyyyy
  • Hi cloudyyyyy, thank you for your answer, it's helpful... but is there any solution if I have a list of tensors with different lengths, to stack them into one tensor? – Omar Abdelaziz Mar 08 '19 at 19:08
  • not sure if this is useful but here is a solution from the pytorch forum: https://discuss.pytorch.org/t/nested-list-of-variable-length-to-a-tensor/38699/21 – Charlie Parker Feb 09 '21 at 20:02