I have a snippet of code that looks like this:

import numpy as np

def slice_table(table, index_vector):
    # Collect the list-valued indices and remember which axes they belong to
    to_index_product = []
    array_indices = []
    for i, index in enumerate(index_vector):
        if isinstance(index, list):
            to_index_product.append(index)
            array_indices.append(i)

    # np.ix_ turns the lists into broadcastable index arrays (an open mesh)
    index_product = np.ix_(*to_index_product)
    for i, multiple in enumerate(index_product):
        index_vector[array_indices[i]] = multiple

    index_vector = tuple(index_vector)
    sliced_table = table[index_vector]
    return sliced_table

table is an np.ndarray of shape (6, 7, 2, 2, 2, 11, 9).

The purpose of the function is to pick out the values that satisfy all the given indices. Advanced NumPy indexing pairs up the elements of the given index arrays one to one and picks out separate values, rather than the intersections I want, so I use np.ix_() to build index arrays that broadcast against each other and let me extract the full cross product along each dimension. My initial test slice worked as desired, so I was content with the code:

index_vector = [5, [1, 2], 1, 1, 1, [0, 3, 7], slice(0, 9, None)]
# The actual `index_vector` is code-generated, hence the usage of `slice()` object
sliced_table = slice_table(table, index_vector)
sliced_table.shape  # (2, 3, 9)

In this example, every dimension except the 2nd, 6th and 7th gets an integer index and is thus absent from the slice. The shape of the slice follows directly from the vector: it has 2 integers as the 2nd index, 3 integers as the 6th, and a slice as the 7th (meaning the entire length of that dimension is preserved). These examples also work:

index_vector = [5, [1, 2], 1, 1, 1, [0, 3, 7, 8], 1]
sliced_table = slice_table(table, index_vector)
sliced_table.shape  # (2, 4)

index_vector = [5, [1, 2], 1, 1, 1, [0, 3, 7, 8], [1, 3]]
sliced_table = slice_table(table, index_vector)
sliced_table.shape  # (2, 4, 2)
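
To illustrate the difference between paired advanced indexing and the np.ix_() product described above, here is a minimal sketch on a small made-up array (`a` is hypothetical and much smaller than the real table):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Plain advanced indexing pairs the index lists element by element:
# it picks a[0, 1] and a[2, 3].
paired = a[[0, 2], [1, 3]]
print(paired)  # [ 1 11]

# np.ix_ reshapes the lists to (2, 1) and (1, 2) so they broadcast
# into the full cross product of rows {0, 2} and columns {1, 3}.
rows, cols = np.ix_([0, 2], [1, 3])
block = a[rows, cols]
print(block.shape)  # (2, 2)
```

The second form is what slice_table relies on for the list-valued entries.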

However, for the code below, the shape is not what I expect it to be:

index_vector = [
    slice(0, 6, None),
    [1, 2],
    slice(0, 2, None),
    slice(0, 2, None),
    slice(0, 2, None),
    [0, 3, 7, 8],
    1,
]
sliced_table = slice_table(table, index_vector)
sliced_table.shape  # (2, 4, 6, 2, 2, 2)

The shape I want it to be is (6, 2, 2, 2, 2, 4), but for some reason there's a reshuffling taking place and the shape is all wrong. It's a bit hard to say whether the elements are wrong, too, because most of table is filled with None, but from the non-NoneType objects that I get, it feels that I get the desired elements (I don't see any undesired ones, that is), just reshaped for some reason.

Why does this happen? Maybe I don't correctly understand how np.ix_() works and I can't just build a product of array indices and extract the desired matrices for each dimension one by one, like I do in my function? Or is there something I don't get about NumPy indexing?

  • That's a case of mixed basic and advanced indexing. It's documented. The slice dimensions are moved to the end. I've tried to explain it in previous SO. – hpaulj Jun 14 '22 at 11:59
  • Oh, I must've missed that. Thank you! P.S.: It makes little sense to me why they would do this, though, seems to make everything harder to work with – Oleg Shevchenko Jun 14 '22 at 12:00
  • The details of why they have to do this are buried in compiled code. According to bug issue discussions a while back, changing or correcting this is difficult. The docs suggest splitting the indexing, such as `table[:,[0,2]][...,[0,3,7,8],1]` – hpaulj Jun 14 '22 at 15:26
  • My previous answer on this topic: https://stackoverflow.com/questions/72408297/getting-unexpected-shape-while-slicing-a-numpy-array/72410804#72410804 – hpaulj Jun 14 '22 at 15:28
  • Well, the fix was easy, I simply encapsulated every integer index into a list and broadcast the entire `index_vector` – Oleg Shevchenko Jun 14 '22 at 19:39
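
The fix mentioned in the last comment could look roughly like the sketch below. The name `slice_table_ordered` is hypothetical, and converting slices to index arrays and squeezing out the integer axes afterwards are assumptions about what "broadcast the entire `index_vector`" involves; the original fix was not posted.

```python
import numpy as np

def slice_table_ordered(table, index_vector):
    """Hypothetical variant of slice_table that keeps the original axis order."""
    index_lists = []
    int_axes = []
    for axis, index in enumerate(index_vector):
        if isinstance(index, slice):
            # Assumption: expand the slice into explicit indices for this axis
            index_lists.append(np.arange(*index.indices(table.shape[axis])))
        elif isinstance(index, (int, np.integer)):
            # Wrap the integer in a list; its length-1 axis is dropped below
            index_lists.append([index])
            int_axes.append(axis)
        else:
            index_lists.append(list(index))

    # Every axis now has an index array, so no basic indexing is mixed in
    result = table[np.ix_(*index_lists)]
    if int_axes:
        result = result.squeeze(axis=tuple(int_axes))
    return result

table = np.arange(6 * 7 * 2 * 2 * 2 * 11 * 9).reshape(6, 7, 2, 2, 2, 11, 9)
index_vector = [
    slice(0, 6, None), [1, 2], slice(0, 2, None), slice(0, 2, None),
    slice(0, 2, None), [0, 3, 7, 8], 1,
]
print(slice_table_ordered(table, index_vector).shape)  # (6, 2, 2, 2, 2, 4)
```

Because every axis is indexed with an array, the result is a copy rather than a view, which may matter for large tables.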

1 Answer

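As @hpaulj mentioned, when advanced indices are separated by slices, the dimensions produced by the advanced indexing come first in the result, followed by the dimensions from the basic (slice) indices. In your failing example the index arrays at positions 2 and 6, together with the integer at position 7, are separated by slice objects, so the (2, 4) part is moved to the front and the sliced dimensions are appended after it. An excerpt from the docs: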

The easiest way to understand a combination of multiple advanced indices may be to think in terms of the resulting shape. There are two parts to the indexing operation, the subspace defined by the basic indexing (excluding integers) and the subspace from the advanced indexing part. Two cases of index combination need to be distinguished:

The advanced indices are separated by a slice, Ellipsis or newaxis. For example x[arr1, :, arr2].

The advanced indices are all next to each other. For example x[..., arr1, arr2, :] but not x[arr1, :, 1] since 1 is an advanced index in this regard.

In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that. In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).
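
Both cases can be reproduced on an array with the shape from the question. This is only a sketch: the table is filled with zeros, and `rows` and `cols` are illustrative names for the arrays that np.ix_() returns.

```python
import numpy as np

table = np.zeros((6, 7, 2, 2, 2, 11, 9))
rows, cols = np.ix_([1, 2], [0, 3, 7, 8])  # shapes (2, 1) and (1, 4)

# Case 1: the advanced indices (rows, cols and the integer 1) are separated
# by slices, so their broadcast shape (2, 4) comes first.
separated = table[:, rows, :, :, :, cols, 1]
print(separated.shape)  # (2, 4, 6, 2, 2, 2)

# Case 2: the advanced indices are all adjacent, so (2, 4) stays in place.
adjacent = table[:, rows, 1, 1, 1, cols, :]
print(adjacent.shape)  # (6, 2, 4, 9)

# Splitting the indexing into two steps, as the docs suggest, keeps the
# original axis order for the first case.
split = table[:, [1, 2]][..., [0, 3, 7, 8], 1]
print(split.shape)  # (6, 2, 2, 2, 2, 4)
```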