40

I'm having a problem with np.append.

I'm trying to duplicate the last column of a 20x361 matrix, n_list_converted, using the code below:

n_last = []
n_last = n_list_converted[:, -1]
n_lists = np.append(n_list_converted, n_last, axis=1)

But I get this error:

ValueError: all the input arrays must have same number of dimensions

However, I've checked the matrix dimensions by doing

 print(n_last.shape, type(n_last), n_list_converted.shape, type(n_list_converted))

and I get

(20L,) <type 'numpy.ndarray'> (20L, 361L) <type 'numpy.ndarray'>

so the dimensions match? Where is the mistake?

odo22

5 Answers

27

If I start with a 3x4 array, and concatenate a 3x1 array, with axis 1, I get a 3x5 array:

In [911]: x = np.arange(12).reshape(3,4)
In [912]: np.concatenate([x,x[:,-1:]], axis=1)
Out[912]: 
array([[ 0,  1,  2,  3,  3],
       [ 4,  5,  6,  7,  7],
       [ 8,  9, 10, 11, 11]])
In [913]: x.shape,x[:,-1:].shape
Out[913]: ((3, 4), (3, 1))

Note that both inputs to concatenate have 2 dimensions.

Omit the trailing : and x[:,-1] has shape (3,). It is 1d, hence the error:

In [914]: np.concatenate([x,x[:,-1]], axis=1)
...
ValueError: all the input arrays must have same number of dimensions
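The check that exposes the mismatch is the number of dimensions, not the length. Here is a minimal sketch of that check; the names col_1d, col_2d, failed and joined are illustrative and not from the original post.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

col_1d = x[:, -1]    # integer index drops the axis
col_2d = x[:, -1:]   # slice keeps the axis

print(col_1d.shape, col_1d.ndim)   # (3,) 1
print(col_2d.shape, col_2d.ndim)   # (3, 1) 2

# Mixing a 2d and a 1d input is what triggers the error.
try:
    np.concatenate([x, col_1d], axis=1)
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)

# With two 2d inputs the join on axis 1 succeeds.
joined = np.concatenate([x, col_2d], axis=1)
print(joined.shape)   # (3, 5)
```

Comparing .ndim (or the length of .shape) for both inputs before concatenating is usually the quickest way to spot this.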

The code for np.append is (in this case where axis is specified)

return concatenate((arr, values), axis=axis)

So with a slight change of syntax, append works: instead of a list, it takes 2 arguments. It imitates the list append syntax, but should not be confused with that list method.

In [916]: np.append(x, x[:,-1:], axis=1)
Out[916]: 
array([[ 0,  1,  2,  3,  3],
       [ 4,  5,  6,  7,  7],
       [ 8,  9, 10, 11, 11]])
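To illustrate how np.append differs from the list method it resembles, here is a small sketch. The names items, ret, y and flat are made up for this example.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

# list.append mutates the list in place and returns None.
items = [1, 2]
ret = items.append(3)
print(ret, items)         # None [1, 2, 3]

# np.append returns a new array; x itself is unchanged.
y = np.append(x, x[:, -1:], axis=1)
print(x.shape, y.shape)   # (3, 4) (3, 5)

# Without axis, both inputs are flattened first, so the result is 1d
# (and the 1d x[:, -1] is accepted without complaint).
flat = np.append(x, x[:, -1])
print(flat.shape)         # (15,)
```

Because every call allocates a new array, calling np.append repeatedly in a loop is usually a poor substitute for collecting pieces in a list and concatenating once.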

np.hstack first makes sure all inputs are atleast_1d, and then does concatenate:

return np.concatenate([np.atleast_1d(a) for a in arrs], 1)

So it requires the same x[:,-1:] input. Essentially the same action.

np.column_stack also does a concatenate on axis 1. But first it passes 1d inputs through

array(arr, copy=False, subok=True, ndmin=2).T

This is a general way of turning that (3,) array into a (3,1) array.

In [922]: np.array(x[:,-1], copy=False, subok=True, ndmin=2).T
Out[922]: 
array([[ 3],
       [ 7],
       [11]])
In [923]: np.column_stack([x,x[:,-1]])
Out[923]: 
array([[ 0,  1,  2,  3,  3],
       [ 4,  5,  6,  7,  7],
       [ 8,  9, 10, 11, 11]])
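As a rough sketch of that reshaping step on its own: the snippet below keeps only the ndmin=2 argument (copy and subok are left at their defaults, which is a simplification of the library code quoted above) and checks that a manual concatenate matches column_stack. The names v, as_row, as_col, stacked and manual are illustrative.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
v = x[:, -1]                     # shape (3,)

as_row = np.array(v, ndmin=2)    # ndmin=2 prepends an axis: shape (1, 3)
as_col = as_row.T                # transpose gives shape (3, 1)
print(as_row.shape, as_col.shape)        # (1, 3) (3, 1)

# column_stack accepts the 1d input; concatenate needs the 2d column.
stacked = np.column_stack([x, v])
manual = np.concatenate([x, as_col], axis=1)
print(np.array_equal(stacked, manual))   # True
```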

All these 'stacks' can be convenient, but in the long run, it's important to understand dimensions and the base np.concatenate. Also know how to look up the code for functions like this. I use the ipython ?? magic a lot.

And in time tests, np.concatenate is noticeably faster; with a small array like this, the extra layers of function calls make a big time difference.
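One way to run such a time test yourself is with the standard timeit module. This is only a sketch: the candidates dictionary and the repeat count are choices made for this example, and the numbers depend on your machine and NumPy version, so no results are shown.

```python
import timeit
import numpy as np

x = np.arange(12).reshape(3, 4)
col = x[:, -1:]   # 2d column, valid input for all four functions

# Each entry builds the same (3, 5) result through a different wrapper.
candidates = {
    "np.concatenate": lambda: np.concatenate([x, col], axis=1),
    "np.append": lambda: np.append(x, col, axis=1),
    "np.hstack": lambda: np.hstack([x, col]),
    "np.column_stack": lambda: np.column_stack([x, col]),
}

number = 20_000   # calls per measurement
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=number)
    print(f"{name:16s} {seconds / number * 1e6:.2f} us per call")
```

In IPython, %timeit on each expression gives the same kind of comparison with less typing.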

hpaulj
  • This seems crazy to me. Why would numpy define arrays as [m,] and not [m,1] by default? – Sean Nov 10 '22 at 14:19
  • @Sean, in MATLAB, where everything is 2d (or more) and Fortran/column-major is the default, a column vector with (m,1) shape may be most natural. But `numpy` is Python, written in C, and is used for more than linear algebra. `np.array([1,2,3])` is a lot simpler to type and display than `np.array([[1],[2],[3]])`. A 1d array maps naturally to/from a simple Python list of numbers. And with row-major ordering, the leading dimension is outermost, so a (m,) is a lot more like a (1,m) than a (m,1). But it's trivial to change between these 3 shapes. – hpaulj Nov 10 '22 at 23:47
13

(n,) and (n,1) are not the same shape. Try turning the 1d vector into a 2d column by using the [:, None] notation:

n_lists = np.append(n_list_converted, n_last[:, None], axis=1)

Alternatively, when extracting n_last you can use

n_last = n_list_converted[:, -1:]

to get a (20, 1) array.
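For reference, here is a sketch of several equivalent ways to get the (20, 1) column. The arange data is a stand-in for the real matrix, and the names a, b, c and d are only for illustration.

```python
import numpy as np

# Stand-in data with the same shape as in the question.
n_list_converted = np.arange(20 * 361).reshape(20, 361)
n_last = n_list_converted[:, -1]     # shape (20,)

a = n_last[:, None]                  # None inserts a new axis
b = n_last[:, np.newaxis]            # np.newaxis is an alias for None
c = n_last.reshape(-1, 1)            # -1 lets numpy infer the length
d = n_list_converted[:, -1:]         # slicing keeps the column 2d

for col in (a, b, c, d):
    print(col.shape)                 # (20, 1) each time

n_lists = np.append(n_list_converted, a, axis=1)
print(n_lists.shape)                 # (20, 362)
```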

Aguy
7

You get this error because a "1 by n" matrix is different from an array of length n.

I recommend using hstack() and vstack() instead. Like this:

import numpy as np
a = np.arange(32).reshape(4,8) # 4 rows 8 columns matrix.
b = a[:,-1:]                    # last column of that matrix.

result = np.hstack((a,b))       # stack them horizontally like this:
#array([[ 0,  1,  2,  3,  4,  5,  6,  7,  7],
#       [ 8,  9, 10, 11, 12, 13, 14, 15, 15],
#       [16, 17, 18, 19, 20, 21, 22, 23, 23],
#       [24, 25, 26, 27, 28, 29, 30, 31, 31]])

Notice the repeated "7, 15, 23, 31" column. Also notice that I used a[:,-1:] instead of a[:,-1]. My version generates a 2-D column:

array([[ 7],
       [15],
       [23],
       [31]])

instead of the 1-D array([ 7, 15, 23, 31]).


Edit: append() is much slower. Read this answer.

Community
RuRo
  • `np.append` is slower than list `.append`; but comparable to the `stacks`. It uses `np.concatenate`. – hpaulj Aug 09 '16 at 12:59
  • @hpaulj So... As I was saying using `append` vs `stack` is the same with 2 matrices and `stack` is better for more than 2 elements, so `stack` is always _at least as good as_ `append`. – RuRo Aug 09 '16 at 15:15
5

You can also turn (n,) into (1,n) by enclosing it within brackets [ ]. This gives a row, so it suits appending along axis=0.

e.g. Instead of np.append(b,a,axis=0) use np.append(b,[a],axis=0)

a=[1,2]
b=[[5,6],[7,8]]
np.append(b,[a],axis=0)

returns

array([[5, 6],
       [7, 8],
       [1, 2]])
ZZZ
0

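I normally use np.row_stack((ndarray_1, ndarray_2, ..., ndarray_nth)), which stacks arrays as rows (it is an alias for np.vstack).

Here you are adding a column rather than a row, so a (20,) array cannot be row-stacked onto a (20, 361) matrix. The column counterpart, np.column_stack, accepts the 1-D n_last directly, so this should work for you:

n_last = n_list_converted[:, -1]
n_lists = np.column_stack((n_list_converted, n_last))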