Using numpy, what is a good way to index into a matrix using another matrix whose entry values are column numbers?

Question

Suppose I want to indepedently re-order each row of a matrix. Here is an example of that using np.argsort():

>>> A
array([[88, 44, 77, 33, 77],
       [33, 55, 66, 88,  0],
       [88,  0,  0, 55, 88],
       [ 0, 22, 44, 88, 33],
       [33, 33, 77, 66, 66]])

>>> ind = np.argsort(A); ind
array([[3, 1, 2, 4, 0],
       [4, 0, 1, 2, 3],
       [1, 2, 3, 0, 4],
       [0, 1, 4, 2, 3],
       [0, 1, 3, 4, 2]])

>>> np.array([A[i][ind[i]] for i in range(ind.shape[0])])
array([[33, 44, 77, 77, 88],
       [ 0, 33, 55, 66, 88],
       [ 0,  0, 55, 88, 88],
       [ 0, 22, 33, 44, 88],
       [33, 33, 66, 66, 77]])

The last expression above (the one that uses range()) is one solution to my problem. My question is: Is there a better way to do this?

The inputs are two matrices like A and ind, both 2-dimensional and of the same size. The matrix A can have any values, and the values of ind are interepreted to be column indexes within A. The output is a new matrix the same size as ind whose values are from A as per the above expression: np.array([A[i][ind[i]] for i in range(ind.shape[0])]).

Each row of ind corresponds to the same row in A. Entry B[i,j] of the output comes from entry A[i, ind[i, j]] of the input.

Note that ind may have fewer columns than A, and I would like to support that case.

I'm asking because my solution (the given expression) is essentially using a for loop, and maybe numpy can do this more quickly using some internal loop. For my application, speed is important, so it would be nice if I can be time-efficient.

Replace the iteration with a `np.arange(5)` row index. That may need some transposes. Or `np.take_along_axis(A, np.argsort(A),1)` - read its docs — hpaulj, Mar 03 '23 at 00:44

score 1 · Accepted Answer · answered Mar 03 '23 at 17:13

np.take_along_axiswas added (recently) to facilitate this kind of indexing. I believe its docs mention its use withargsort`:

In [113]: A=np.array([[88, 44, 77, 33, 77],
     ...:        [33, 55, 66, 88,  0],
     ...:        [88,  0,  0, 55, 88],
     ...:        [ 0, 22, 44, 88, 33],
     ...:        [33, 33, 77, 66, 66]])
     ...: 
     ...: ind=np.array([[3, 1, 2, 4, 0],
     ...:        [4, 0, 1, 2, 3],
     ...:        [1, 2, 3, 0, 4],
     ...:        [0, 1, 4, 2, 3],
     ...:        [0, 1, 3, 4, 2]])

In [114]: np.take_along_axis(A, np.argsort(A),1)
Out[114]: 
array([[33, 44, 77, 77, 88],
       [ 0, 33, 55, 66, 88],
       [ 0,  0, 55, 88, 88],
       [ 0, 22, 33, 44, 88],
       [33, 33, 66, 66, 77]])

I can do the same with advanced indexing, replacing the range iteration with np.arange.

In [119]: A[np.arange(5),ind.T].T
Out[119]: 
array([[33, 44, 77, 77, 88],
       [ 0, 33, 55, 66, 88],
       [ 0,  0, 55, 88, 88],
       [ 0, 22, 33, 44, 88],
       [33, 33, 66, 66, 77]]

I had to add a couple of transpose the get the desired output and layout. I haven't taken time to figure out just why those are needed.

Applying the argsort to axis 0 can be done with these two:

In [123]: A[ind,np.arange(5)]
Out[123]: 
array([[ 0, 55,  0, 66, 77],
       [33, 44, 66, 55, 33],
       [33,  0, 44, 33, 66],
       [88, 55, 77, 55, 33],
       [88, 55, 44, 66, 88]])

In [124]: np.take_along_axis(A, np.argsort(A),0)
Out[124]: 
array([[ 0, 55,  0, 66, 77],
       [33, 44, 66, 55, 33],
       [33,  0, 44, 33, 66],
       [88, 55, 77, 55, 33],
       [88, 55, 44, 66, 88]])

Using numpy, what is a good way to index into a matrix using another matrix whose entry values are column numbers?

1 Answers1