Index numpy nd array along last dimension

Question

Is there an easy way to index a numpy multidimensional array along the last dimension, using an array of indices? For example, take an array a of shape (10, 10, 20). Let's assume I have an array of indices b, of shape (10, 10) so that the result would be c[i, j] = a[i, j, b[i, j]].

I've tried the following example:

a = np.ones((10, 10, 20))
b = np.tile(np.arange(10) + 10, (10, 1))
c = a[b]

However, this doesn't work because it then tries to index like a[b[i, j], b[i, j]], which is not the same as a[i, j, b[i, j]]. And so on. Is there an easy way to do this without resorting to a loop?

Just to make sure I understand properly, you want `c[i, j] = a[i, j, b[i, j]]` — mgilson, Dec 03 '14 at 17:00
This is normally done like `c = a[np.arange(b.shape[0]), np.arange(b.shape[1]), b]` but I'm hoping there's a better way. — , Dec 03 '14 at 20:31
@moarningsun That fails my correctness test. I think you must convert one of the two aranges to a column vector or so. — Bas Swinckels, Dec 03 '14 at 21:22

Bas Swinckels · Accepted Answer · 2014-12-04T08:28:30.357

6

There are several ways to do this. Let's first generate some test data:

In [1]: a = np.random.rand(10, 10, 20)

In [2]: b = np.random.randint(20, size=(10,10))  # random integers in range 0..19

One way to solve the question would be to create two index vectors, where one is a row vector and the other a column vector of 0..9 using meshgrid:

In [3]: i1, i0 = np.meshgrid(range(10), range(10), sparse=True)

In [4]: c = a[i0, i1, b]

This works because i0, i1 and b will all be broadcasted to 10x10 matrices. Quick test for correctness:

In [5]: all(c[i, j] == a[i, j, b[i, j]] for i in range(10) for j in range(10))
Out[5]: True

Another way would be to use choose and rollaxis:

# choose needs a sequence of length 20, so move last axis to front
In [22]: aa = np.rollaxis(a, -1)  

In [23]: c = np.choose(b, aa)

In [24]: all(c[i, j] == a[i, j, b[i, j]] for i in range(10) for j in range(10))
Out[24]: True

edited Dec 04 '14 at 08:28

answered Dec 03 '14 at 21:01

Bas Swinckels

18,095
3
45
62

1

Thank you. I like your first option better. The `choose` option looks nice, but it is not general enough as `choose` does not work for indices larger than 32 (a current bug in numpy, see [discussion on increasing `NPY_MAXARGS`](https://github.com/numpy/numpy/issues/4398)). According to my tests, it is also twice as slow as the `meshgrid` option. – tiago Dec 04 '14 at 09:45
1

In that case, you should probably look into functions like `mgrid` and `ogrid`, which are referenced from the `meshgrid` documentation. But do you work with matrices with more than 32 dimensions? I feel sorry for you, my head already explodes when trying to understand an array with more than 2 dimensions :). – Bas Swinckels Dec 04 '14 at 09:53
Ah, I see your error now when going from 10x10x20 to 10x10x40. I guess `choose` unpacks the 3D matrix into a list of 2D matrices along the first dimension. – Bas Swinckels Dec 04 '14 at 09:59
@BasSwinckels I tried this with unequal row and column sizes and it didn't work (`IndexError: shape mismatch: indexing arrays could not be broadcast together`). I'm not familiar with meshgrid so could you update it? – KobeJohn Sep 27 '15 at 03:12

Index numpy nd array along last dimension

1 Answers1

Linked