I have a list of numpy arrays, each with shape (10,4,4,20) or (10,4,6,20). I want to convert the list to a numpy array. Because the shapes differ, I can't simply stack them, so I thought of creating an object array in which each numpy array is a single element. I tried the following:

b = numpy.array(a)
b = numpy.array(a, dtype=object)

where a is the list of numpy arrays. Both calls raise the following error:

ValueError: could not broadcast input array from shape (10,4,4,20) into shape (10,4)

How can I convert that list to a numpy array?

Example:

import numpy
a = [numpy.random.random((10,4,4,20)),
     numpy.random.random((10,4,6,20)),
     numpy.random.random((10,4,6,20)),
     numpy.random.random((10,4,4,20)),
     numpy.random.random((10,4,6,20)),
     numpy.random.random((10,4,6,20)),
     numpy.random.random((10,4,4,20)),
     numpy.random.random((10,4,4,20)),
     numpy.random.random((10,4,6,20))
    ]
b = numpy.array(a)

Use Case:
I know that numpy arrays of objects are not efficient, but I'm not doing any operations on them. Usually I have a list of numpy arrays with the same shape, so I can easily stack them. The resulting array is passed to another function, which selects only certain elements. If my data is a numpy array, I can just do b[[1,3,8]], but I can't do the same with a list. If I try, I get the following error:

c = a[[1,3,8]]
TypeError: list indices must be integers or slices, not list
faressalem
Nagabhushan S N
  • If you keep your list the way it is, you can still access any element you want in the following way `a[0][1, 3, 3, 8]` where 0 is simply the first numpy array in the list. However, how do you differentiate arrays which third dimension is 6, meaning how do you know if you can actually access `a[6][1, 3, 5, 8]` for example? – Patol75 Apr 25 '20 at 04:50
  • I don't want to access `1,3,5,8`th element of first array in my list. I want to access `1,3,5,8`th numpy arrays in my list. Sorry, if my question was confusing since I used only 2 numpy arrays in my example. I have updated the question. Hope it's clear now – Nagabhushan S N Apr 25 '20 at 04:55
  • Thank you for clarifying. What about using `[a[x] for x in [1, 3, 5, 8]]`? What I do not understand is why you need the arrays to be stacked if you do not perform operations on them? – Patol75 Apr 25 '20 at 05:22
  • Yes. I can do that. Usually, I get same shape numpy arrays. So, my rest of the code works assuming it is a numpy array. This one is just a special case. I don't want to add extra handling code wherever this data is used. – Nagabhushan S N Apr 25 '20 at 05:26

1 Answer

np.array(alist) will make an object dtype array if the arrays in the list differ in the first dimension. But in your case they differ in the third dimension, which produces this error. In effect, numpy can't unambiguously determine where the containing dimensions end and where the objects begin.
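For contrast, here is a minimal sketch of the case that does work, where the arrays differ in the first dimension. The shapes (3,4) and (5,4) are made up for illustration, and dtype=object is passed explicitly on the assumption that a recent NumPy release is in use, since newer versions require it for ragged input.

```python
import numpy as np

# Hypothetical shapes: the two arrays differ in the FIRST dimension (3 vs 5).
ragged = [np.ones((3, 4), int), np.zeros((5, 4), int)]

# Request an object array explicitly; each list entry becomes one element.
arr = np.array(ragged, dtype=object)

print(arr.shape)     # (2,)
print(arr.dtype)     # object
print(arr[1].shape)  # (5, 4)
```

Here the mismatch shows up at the outermost level, so numpy stops there and stores each array as a single object.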

In [270]: alist = [np.ones((10,4,4,20),int), np.zeros((10,4,6,20),int)]                                
In [271]: arr = np.array(alist)                                                                        
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-271-3fd8e9bd05a9> in <module>
----> 1 arr = np.array(alist)

ValueError: could not broadcast input array from shape (10,4,4,20) into shape (10,4)

Instead, we need to make an object array of the right size and copy the list into it. Sometimes this copy still produces broadcasting errors, but here it works:

In [272]: arr = np.empty(2, object)                                                                    
In [273]: arr                                                                                          
Out[273]: array([None, None], dtype=object)
In [274]: arr[:] = alist                                                                               
In [275]: arr                                                                                          
Out[275]: 
array([array([[[[1, 1, 1, ..., 1, 1, 1],
         [1, 1, 1, ..., 1, 1, 1],
         [1, 1, 1, ..., 1, 1, 1],
...
         [0, 0, 0, ..., 0, 0, 0],
         [0, 0, 0, ..., 0, 0, 0]]]])], dtype=object)
In [276]: arr[0].shape                                                                                 
Out[276]: (10, 4, 4, 20)
In [277]: arr[1].shape                                                                                 
Out[277]: (10, 4, 6, 20)
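The same idea can be wrapped in a small helper and applied to the nine-array list from the question. This is only a sketch: the name to_object_array is hypothetical, and it fills the slots one at a time rather than with arr[:] = alist, on the assumption that per-element assignment sidesteps the occasional broadcasting error mentioned above (for example, when all arrays happen to share one shape).

```python
import numpy as np

def to_object_array(arrays):
    """Pack a list of arrays into a 1-D object array, one array per slot."""
    out = np.empty(len(arrays), dtype=object)
    for i, arr in enumerate(arrays):
        out[i] = arr  # stores a reference to the array; no broadcasting involved
    return out

# Same mix of shapes as in the question.
shapes = [(10, 4, 4, 20), (10, 4, 6, 20), (10, 4, 6, 20),
          (10, 4, 4, 20), (10, 4, 6, 20), (10, 4, 6, 20),
          (10, 4, 4, 20), (10, 4, 4, 20), (10, 4, 6, 20)]
a = [np.random.random(shape) for shape in shapes]

b = to_object_array(a)
c = b[[1, 3, 8]]  # fancy indexing now works as it would on a stacked array

print(b.shape)                # (9,)
print(c.shape)                # (3,)
print([x.shape for x in c])   # [(10, 4, 6, 20), (10, 4, 4, 20), (10, 4, 6, 20)]
```

The selected elements are references to the original arrays, not copies, so this adds little memory overhead.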
hpaulj