
I noticed some confusing behavior when indexing a flat NumPy array with a list of tuples (using Python 2.7.8 and NumPy 1.9.1). My guess is that this is related to the maximum number of array dimensions (which I believe is 32), but I haven't been able to find any documentation for it.

>>> a = np.arange(100)
>>> tuple_index = [(i,) for i in a]
>>> a[tuple_index] # This works (but maybe it shouldn't)
>>> a[tuple_index[:32]] # This works too
>>> a[tuple_index[:31]] # This breaks, as does a[tuple_index[:i]] for any 2 <= i < 32
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: too many indices for array
>>> a[tuple_index[:1]] # This also works...

Is the list of tuples being "flattened" if it has 32 or more elements? Is this documented somewhere?

kadrlica
  • Interesting, I get a different error message: `IndexError: unsupported iterator index`. Using python 2.7 and numpy 1.8.2 – swenzel Jun 02 '15 at 14:23
  • Sorry, I should have specified the versions (python 2.7.8; numpy 1.9.1). I've updated the question. – kadrlica Jun 03 '15 at 15:58

1 Answer


The difference appears to be that the first examples trigger fancy indexing (which simply selects indices in a list from the same dimension) whereas tuple_index[:31] is instead treated as an indexing tuple (which implies selection from multiple axes).

As you noted, the maximum number of dimensions for a NumPy array is (usually) 32:

>>> np.MAXDIMS
32
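np.MAXDIMS may not be available in every NumPy release (as far as I know it was dropped from the Python namespace in NumPy 2.0, where the limit was also raised), so the limit can instead be probed directly. A minimal sketch; `max_array_ndim` is a hypothetical helper written for this illustration, not a NumPy function:

```python
import numpy as np


def max_array_ndim(limit=256):
    """Return the largest ndim for which np.zeros succeeds, probing up to `limit`."""
    largest = 0
    for ndim in range(1, limit + 1):
        try:
            # A shape of all ones keeps the array tiny no matter how many axes it has.
            np.zeros((1,) * ndim)
        except ValueError:
            # NumPy rejects shapes with more axes than it supports.
            break
        largest = ndim
    return largest


print(max_array_ndim())
```

On the NumPy 1.9.1 from the question this should agree with np.MAXDIMS.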

According to the following comment in the mapping.c file (which contains the code to interpret the index passed by the user), any sequence with fewer than 32 elements that contains tuples (or other sequences, arrays, slices, newaxis or Ellipsis) is treated as an indexing tuple:

/*
 * Sequences < NPY_MAXDIMS with any slice objects
 * or newaxis, Ellipsis or other arrays or sequences
 * embedded, are considered equivalent to an indexing
 * tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
 */

(I haven't yet found a reference for this in the official documentation on the SciPy site.)

This makes a[tuple_index[:3]] equivalent to a[(0,), (1,), (2,)], hence the "too many indices" error (because a has only one dimension but we're implying there are three).

On the other hand, a[tuple_index] is just the same as a[[(0,), (1,), (2,), ..., (99,)]], which is fancy indexing with a 100x1 index and therefore results in a 2D array of shape (100, 1).
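Both interpretations can be requested explicitly, which avoids relying on the length-based heuristic. A short sketch, reusing the names from the question; converting with `np.array(...)` or `tuple(...)` is my suggestion here, not something the original code did:

```python
import numpy as np

a = np.arange(100)
tuple_index = [(i,) for i in a]

# Explicit array index: fancy indexing with a (100, 1) integer array.
fancy = a[np.array(tuple_index)]
print(fancy.shape)  # (100, 1)

# Explicit indexing tuple: one entry per axis, so three entries are too many for a 1D array.
error_message = None
try:
    a[tuple(tuple_index[:3])]
except IndexError as exc:
    error_message = str(exc)
print("IndexError:", error_message)

# A one-element indexing tuple only touches the first axis, so it is valid.
single = a[tuple(tuple_index[:1])]
print(single)  # [0]
```

Writing the index as an array or a tuple makes the intent unambiguous for any list length.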

Alex Riley
  • The [basics documentation](http://docs.scipy.org/doc/numpy/user/basics.indexing.html) claims that *Index arrays must be of integer type.* Apparently the implementation allows this anyway. What I still can't figure out is why you get a 2D array from the list of tuples greater than `np.MAXDIMS`. – Eric Appelt Jun 02 '15 at 14:48
  • I agree it doesn't seem obvious (and I don't know if it's intended or not). When longer than `MAXDIMS` [it looks like](https://github.com/numpy/numpy/blob/1f6e7cc470c6d5af23b2467863f42108e6c5f545/numpy/core/src/multiarray/mapping.c#l395) the list of tuples `[(0,), (1,), (2,) ..., (99,)]` is internally cast to a NumPy array (it will become a 2D array). This is then used to index the original 1D array (returning a 2D array). – Alex Riley Jun 02 '15 at 15:14
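The two branches described in the answer and in the comments can be summarized in a few lines of Python. This is only a paraphrase of the mapping.c comment, not a port of the C code; `treated_as_index_tuple` and the `MAXDIMS = 32` constant are hypothetical names chosen for this sketch, and the value 32 is assumed from the NumPy 1.9.1 setup in the question:

```python
import numpy as np

MAXDIMS = 32  # assumption: the limit reported for the NumPy version in the question


def treated_as_index_tuple(index, maxdims=MAXDIMS):
    """Guess whether `index` would be unpacked into an indexing tuple.

    Follows the wording of the mapping.c comment only.
    """
    if isinstance(index, tuple):
        return True
    if not isinstance(index, list) or len(index) >= maxdims:
        # Long lists are converted to an array and used for fancy indexing.
        return False
    # Short lists count as a tuple if any element is one of these kinds of object.
    special = (slice, type(None), type(Ellipsis), np.ndarray, list, tuple)
    return any(isinstance(item, special) for item in index)


tuple_index = [(i,) for i in range(100)]
for n in (1, 3, 31, 32, 100):
    print(n, treated_as_index_tuple(tuple_index[:n]))
```

Under these assumptions the lists of length 1, 3 and 31 are classified as indexing tuples and the lists of length 32 and 100 as fancy indices, which matches the behavior reported in the question (a one-element indexing tuple is still valid for a 1D array, so only lengths 2 to 31 fail).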