
I have a 2D NumPy array, matrix, of shape (m, n). My actual use case has m ~ 1e5 and n ~ 100, but for the sake of a simple minimal example:

import numpy as np

matrix = np.arange(5*3).reshape((5, 3))

I have an indexing array of integers, idx, of shape (m,), with each entry in the half-open range [0, n). This array specifies which column should be selected from each row of matrix.

idx = np.array([2, 0, 2, 1, 1])

So, I am trying to select column 2 from row 0, column 0 from row 1, column 2 from row 2, column 1 from row 3, and column 1 from row 4. Thus the final answer should be:

correct_result = np.array((2, 3, 8, 10, 13))

I have tried the following, which is intuitive, but incorrect:

incorrect_result = matrix[:, idx]

What the above syntax does is apply idx as a fancy indexing array to the columns of every row, so each row contributes one entry per element of idx. The result is another matrix, of shape (m, m), which is not what I want.
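A quick check on the minimal example shows what that expression returns. This is only a sketch built from the arrays defined above; the observation about the diagonal is an addition, not part of the original question.

```python
import numpy as np

matrix = np.arange(5 * 3).reshape((5, 3))
idx = np.array([2, 0, 2, 1, 1])

# Slicing all rows and indexing the columns with idx selects columns
# idx[0], idx[1], ..., idx[m-1] from *every* row.
incorrect_result = matrix[:, idx]
print(incorrect_result.shape)  # (5, 5)

# The wanted values sit on the diagonal of that array, but extracting
# them this way builds an m-by-m intermediate, which is wasteful for
# large m.
diagonal = np.diag(incorrect_result)
assert np.array_equal(diagonal, [2, 3, 8, 10, 13])
```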

What is the correct syntax for fancy indexing of this type?

aph
  • Thanks @Divakar, this is indeed the same question. Not sure why it didn't come up in my SO search before asking. – aph Aug 24 '16 at 19:16
  • Apparently the search engine on SO isn't that great. So, one thing I do is a Google search with `site:stackoverflow.com` added to the keywords. – Divakar Aug 24 '16 at 19:17
  • Thanks for the tip, that sounds useful! – aph Aug 24 '16 at 19:47

1 Answer

correct_result = matrix[np.arange(m), idx]

The advanced indexing expression matrix[I, J] gives an output such that output[k] == matrix[I[k], J[k]] for every position k.

If we want output[k] == matrix[k, idx[k]], then we need I[k] == k and J[k] == idx[k], so we need I to be np.arange(m) and J to be idx.
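As a minimal sketch, here is the paired-index approach run on the example from the question. The variable names rows, result, and alt are chosen for illustration. The np.take_along_axis call at the end is an alternative that is not part of the original answer and assumes NumPy 1.15 or later.

```python
import numpy as np

matrix = np.arange(5 * 3).reshape((5, 3))
idx = np.array([2, 0, 2, 1, 1])
m = matrix.shape[0]

# Pair each row number with its requested column:
# (0, 2), (1, 0), (2, 2), (3, 1), (4, 1)
rows = np.arange(m)
result = matrix[rows, idx]
assert np.array_equal(result, [2, 3, 8, 10, 13])

# Alternative (assumes NumPy >= 1.15): take_along_axis needs the index
# array to have the same number of dimensions as matrix, so idx is
# reshaped to (m, 1) and the result is flattened back to shape (m,).
alt = np.take_along_axis(matrix, idx[:, None], axis=1)[:, 0]
assert np.array_equal(alt, result)
```

Both forms return an array of shape (m,) and never build an (m, m) intermediate, which matters at m ~ 1e5.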

user2357112