I want to create a 2D numpy.array, knowing at the beginning only its shape, i.e. shape=2. Then, in a for loop, I want to create one-dimensional numpy.arrays one at a time and add each of them to the main matrix of shape=2, so that I get something like this:

matrix=
[numpy.array 1]
[numpy.array 2]
...
[numpy.array n]

How can I achieve that? I tried:

matrix = np.empty(shape=2)
for i in np.arange(100):
    array = np.zeros(random_value)
    matrix = np.append(matrix, array)

But when I call print(np.shape(matrix)) after the loop, I get something like:

(some_number, )

How can I append each new array in the next row of the matrix? Thank you in advance.

bluevoxel
  • This is a really inefficient way to create an array. Every time you append to an array, it generates a new copy. If you're appending within a loop, this process becomes progressively slower as your array grows. It's much better to pre-allocate the output array and then fill in the rows as you go. If you don't know the exact size of the output, you could initially allocate an array larger than you think you need and index only the rows you need once you're done. Another option would be to create a list of arrays, then call `np.array()` or `np.vstack()` once on the list. – ali_m May 08 '15 at 11:39
  • Also, it looks like you're trying to create a ragged array (i.e. one where the length of each row varies), whereas numpy is really only designed to handle rectangular arrays. For example, think about how indexing would work for a ragged array - there's no straightforward way to do slice indexing over the ragged dimension(s), e.g. `a[:, :10]`. Depending on what you want to use the output for, it might make more sense to use a list of arrays, or to create a rectangular array padded with NaNs where there are missing values. – ali_m May 08 '15 at 11:51
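
The two suggestions in the comments above (pre-allocating the output and padding ragged rows with NaNs) can be combined in a minimal sketch. The `rows` list is a hypothetical stand-in for the arrays produced inside the loop, and it assumes the maximum row length can be worked out before allocating:

```python
import numpy as np

# Hypothetical input: rows of differing lengths, standing in for the
# arrays that the loop in the question would produce.
rows = [np.arange(n, dtype=float) for n in (3, 1, 4)]

max_len = max(len(r) for r in rows)

# Allocate the whole output once, filled with NaN to mark missing values.
matrix = np.full((len(rows), max_len), np.nan)

for i, r in enumerate(rows):
    matrix[i, :len(r)] = r  # write row i in place; the array is never copied

print(matrix.shape)
```

Because the result is rectangular, slice indexing such as `matrix[:, :2]` works as usual; NaN-aware functions like `np.nanmean` can then skip the padding.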

3 Answers

I would suggest working with a list:

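import numpy as np

matrix = []  # plain Python list; appending does not copy the existing rows

for i in range(10):
    a = np.ones(2)
    matrix.append(a)

matrix = np.array(matrix)  # convert once at the end; shape is (10, 2)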

A list does not have the downside of being copied in memory every time you append to it, so you avoid the problem described by ali_m. At the end of your operation you just convert the list object into a numpy array.

Asking Questions

I suspect the root of your problem is the meaning of 'shape' in np.empty(shape=2).

If I run a small version of your code

matrix = np.empty(shape=2)
for i in np.arange(3):
    array = np.zeros(3)
    matrix = np.append(matrix, array)

I get

array([  9.57895902e-259,   1.51798693e-314,   0.00000000e+000,
     0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
     0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
     0.00000000e+000,   0.00000000e+000])

See those 2 odd numbers at the start? Those are produced by np.empty(shape=2). That matrix starts as a (2,) shaped array, not an empty 2d array. append just adds sets of 3 zeros to that, resulting in a (11,) array.
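
A quick way to check this is to print the shapes at each step. The sketch below also shows one possible 2d starting point with zero rows; the names `start`, `empty2d` and `grown` are only illustrative:

```python
import numpy as np

start = np.empty(shape=2)   # 1d array holding 2 uninitialized values
print(start.shape)          # (2,)

# Without an axis argument, np.append flattens its inputs and
# returns a 1d array.
out = np.append(start, np.zeros(3))
print(out.shape)            # (5,)

# A 2d array with zero rows and 3 columns really is empty, and rows
# can be concatenated onto it along axis 0.
empty2d = np.empty((0, 3))
grown = np.concatenate([empty2d, np.zeros((1, 3))], axis=0)
print(grown.shape)          # (1, 3)
```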

Now if you started with a 2d array with the right number of columns, and concatenated along the 1st dimension, you would get a multirow array. (Rows only have meaning in 2d or larger.)

mat=np.zeros((1,3))
for i in range(1,3):
    mat = np.concatenate([mat, np.ones((1,3))*i],axis=0)

produces:

array([[ 0.,  0.,  0.],
       [ 1.,  1.,  1.],
       [ 2.,  2.,  2.]])

A better way of doing an iterative construction like this is with list append

alist = []
for i in range(0,3):
    alist.append(np.ones((1,3))*i)
mat=np.vstack(alist)

alist is:

[array([[ 0.,  0.,  0.]]), array([[ 1.,  1.,  1.]]), array([[ 2.,  2.,  2.]])]

mat is

array([[ 0.,  0.,  0.],
       [ 1.,  1.,  1.],
       [ 2.,  2.,  2.]])

With vstack you can get by with np.ones((3,)), since it turns all of its inputs into 2d arrays.
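
As a small illustration of that point, here is the same construction with 1d rows (the name `rows` is just for this sketch):

```python
import numpy as np

# Each element has shape (3,), not (1, 3).
rows = [np.ones((3,)) * i for i in range(3)]

# vstack promotes every 1d input to a 2d row before stacking.
mat = np.vstack(rows)
print(mat.shape)  # (3, 3)
```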

np.append would work, but it also requires the axis=0 parameter, and 2d arrays. It gets misused, often by mistaken analogy to the list append. It is just another front end to concatenate, so I prefer not to use it.
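
To make the axis=0 requirement concrete, here is a minimal sketch; `row`, `mat2` and the `failed` flag are illustrative names only:

```python
import numpy as np

mat = np.zeros((1, 3))
row = np.ones(3)

# Works: both inputs are 2d and agree on the number of columns.
mat2 = np.append(mat, row[np.newaxis, :], axis=0)
print(mat2.shape)  # (2, 3)

# Fails: with axis=0 a 1d row cannot be joined to a 2d array.
failed = False
try:
    np.append(mat, row, axis=0)
except ValueError as exc:
    failed = True
    print("ValueError:", exc)
```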

Notice that other posters assumed your random value changed during the iteration. That would produce arrays of differing lengths. For 1d appending that would still produce the long 1d array. But a 2d append wouldn't work, because a 2d array can't be ragged.

mat = np.zeros((2,),int)
for i in range(4):
    mat=np.append(mat,np.ones((i,),int)*i)
# array([0, 0, 1, 2, 2, 3, 3, 3])
hpaulj

The function you are looking for is np.vstack.

Here is a modified version of your example

import numpy as np

matrix = np.zeros(2)  # np.empty(shape=2) would leave the first row uninitialized

for i in np.arange(3):
    array = np.zeros(2)
    matrix = np.vstack((matrix, array))

The result is

array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
Alfred Rossi
  • The `np.vstack` function requires the stacked arrays to have the same dimensions. – bluevoxel May 08 '15 at 11:25
  • If you really want to do that you can create a numpy array which stores list references but it's not really all that efficient (see here http://stackoverflow.com/questions/3386259/how-to-make-a-multidimension-numpy-array-with-a-varying-row-size). – Alfred Rossi May 08 '15 at 11:28
  • Why do you want the outer array to be a numpy array at all then? – Alfred Rossi May 08 '15 at 11:30
  • It would be much faster to create a list of arrays within the loop, then call `np.vstack` once on the list. This is because concatenating arrays generates a copy, whereas appending an array to a list does not. – ali_m May 08 '15 at 11:43
  • I absolutely agree. I would update the example to reflect this, however I have no idea what OP is trying to achieve as he says he wants to effectively vstack arrays of different length. – Alfred Rossi May 08 '15 at 11:45
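
For completeness, the "numpy array which stores references" option mentioned in the comments above could look roughly like this. It is only a sketch with hypothetical data (`rows`), and a plain list of arrays is usually the simpler choice:

```python
import numpy as np

# Hypothetical rows of differing lengths.
rows = [np.zeros(n) for n in (2, 5, 3)]

# A 1d object array holds a reference to each row; it is not a 2d array.
ragged = np.empty(len(rows), dtype=object)
for i, r in enumerate(rows):
    ragged[i] = r

print(ragged.shape)    # (3,)
print(len(ragged[1]))  # 5
```

Slicing across the ragged dimension (e.g. `ragged[:, :2]`) is not available here, which is the limitation ali_m describes.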