1

I have a NumPy array and a list that defines which column to select from each row. What is the best way to do this operation?

import numpy as np

a = np.array([[1,2,3],
              [4,5,6],
              [7,8,9]])

b = np.array([[1],
              [0],
              [2]])

Desired result

np.array([[2],
         [4],
         [9]])

I have tried np.take(), but it does not give this result.

Kind regards

EDIT: as this needs to be done repeatedly on a large array, I'm looking for a vectorized approach (without loops)

user44136
  • What is the mathematical logic to achieve this? – Mayank Porwal Apr 15 '20 at 12:53
  • For each row in `a` you select the element at the column defined by `b` – user44136 Apr 15 '20 at 12:56
  • You thus want the 1st element of the first row of `a`, the 0th element of the second row and the 2nd element of the last row (as defined in `b`) – user44136 Apr 15 '20 at 12:57
  • 1
    Does this answer your question? [NumPy selecting specific column index per row by using a list of indexes](https://stackoverflow.com/questions/23435782/numpy-selecting-specific-column-index-per-row-by-using-a-list-of-indexes) – Nuageux Apr 15 '20 at 13:25

4 Answers

2

If you remove the extraneous dimensions from b

b = np.squeeze(b)

You can use the following:

a[np.arange(len(b)), b]
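
A minimal sketch of the two steps above, using the arrays from the question. The names `rows`, `cols`, `picked` and `result` are illustrative only, and the final `[:, None]` assumes the column-shaped output shown in the question is wanted.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
b = np.array([[1],
              [0],
              [2]])

cols = np.squeeze(b, axis=1)   # shape (3, 1) -> (3,)
rows = np.arange(len(cols))    # one row index per entry of b

# Integer-array indexing pairs rows[i] with cols[i]: (0, 1), (1, 0), (2, 2)
picked = a[rows, cols]

# Assumption: restore the (3, 1) layout from the question
result = picked[:, None]
print(result)
```

Passing `axis=1` to `np.squeeze` removes only the trailing length-1 axis, so a `b` with a single row would not collapse to a scalar.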
Nils Werner
1

It's not very pythonic, but this should do the trick for your problem:

res = np.zeros(len(b), dtype=a.dtype)
for i, row in enumerate(a):
    res[i] = row[b[i, 0]]  # b[i, 0] is the column index for row i

print(res)

The same in one line (row indices first, then the column indices taken from b):

a[[i for i in range(len(b))], [i[0] for i in b]]
sltzgs
1

NumPy 1.15 added np.take_along_axis, which does what you want:

In [96]: a = np.array([[1,2,3],
    ...:               [4,5,6],
    ...:               [7,8,9]])
    ...:
    ...: b = np.array([[1],
    ...:               [0],
    ...:               [2]])
In [97]: np.take_along_axis(a,b,axis=1)
Out[97]: 
array([[2],
       [4],
       [9]])

It works much like @Nils's answer, a[np.arange(3), np.squeeze(b)], but it accepts the 2-D b as it is and returns a (3, 1) array, so no squeeze is needed.
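
As a rough sketch of that relationship (not taken from the NumPy source), the same selection can be written with a broadcast row index, and `take_along_axis` also accepts several column indices per row. The names `via_take`, `via_index`, `b2` and `multi` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
b = np.array([[1],
              [0],
              [2]])

# b stays 2-D; the result has the same shape as b
via_take = np.take_along_axis(a, b, axis=1)

# Equivalent integer-array indexing: a (3, 1) row index broadcasts against b
via_index = a[np.arange(a.shape[0])[:, None], b]

# Two column indices per row work the same way
b2 = np.array([[1, 2],
               [0, 0],
               [2, 1]])
multi = np.take_along_axis(a, b2, axis=1)

print(via_take)
print(multi)
```

Both forms are vectorized, so neither uses a Python-level loop over the rows.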

Similar recent questions:

efficient per column matrix indexing in numpy

keep elements of an np.ndarray by values of another np.array (vectorized)

hpaulj
0

You could use a list comprehension:

np.array([a[i,b[i]] for i in range(len(b))])
Serge Ballesta