Different behaviour of indexing and slicing in numpy structured arrays

Question

Suppose you have a structured array a:

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6], dtype=[('val', 'i4')])
print(a)
[(1,) (2,) (3,) (4,) (5,) (6,)]

Now, if I would like to change one of the entries to a different value, the following two ways seem to be equivalent (Case I):

# both of these work
"""version a)"""
a['val'][1] = 10
print(a)
[( 1,) (10,) ( 3,) ( 4,) ( 5,) ( 6,)]

"""version b)"""
a[1]['val'] = 2
print(a)
[(1,) (2,) (3,) (4,) (5,) (6,)]

But this ambiguity (not sure if this is the appropriate term) breaks down if we try to change more than one entry (Case II):

"""version a)"""
a['val'][[0, 1]] = 15
print(a)
[(15,) (15,) ( 3,) ( 4,) ( 5,) ( 6,)]
# this works

"""version b)"""
a[[0, 1]]['val'] = 5
print(a)
[(15,) (15,) ( 3,) ( 4,) ( 5,) ( 6,)]
# this has no effect

I thought that maybe in the second case, version b), a new object is created, so assigning a new value to those entries only affects the new object and not the original one. But in the first case, version b), a new object also seems to be created, as both of the following statements print False:

print(a[1]['val'] is a['val'][1])
print(a['val'][[0, 1]] is a[[0, 1]]['val'])

The fact that this ambiguity exists only in the first case, but not in the second, seems inconsistent to me, or at least confusing. What am I missing?

All indexing produces a new array object, so `is` is not a good test. What matters is whether the first indexing produces a `copy` or `view`. `a[...][...] = value` is particularly sensitive to this distinction. — hpaulj, Apr 24 '20 at 19:09
Even with a simple dtype array, `arr[[0,1]][1:] = 1` will not modify `arr`. `arr[[0,1]]` is advanced indexing; `arr[0]` and `arr[3:]` are basic, regardless of dtype. — hpaulj, Apr 24 '20 at 19:45
Look at `a['val']`, `a[1]` and `a[[0,1]]` alone. Pay attention to shape and dtype. Field and record indexing are not interchangeable. And field indexing is not 'column' indexing. — hpaulj, Apr 24 '20 at 20:13
Thank you! So is there any way to check if two arrays originate from the same source array (and additionally hold the same values)? I am not sure if there is actually a use case for that, just curious. — mapf, Apr 27 '20 at 07:38
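On the question in the last comment, one possible approach is to combine `np.shares_memory` with `np.array_equal`. The sketch below is only an illustration: the helper name `overlaps_and_equal` is made up here, and "overlaps in memory" is a narrower test than "originates from the same source array", since two views of disjoint parts of one array do not overlap.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6], dtype=[('val', 'i4')])


def overlaps_and_equal(x, y):
    """Hypothetical helper: True if x and y overlap in memory and hold equal values."""
    return np.shares_memory(x, y) and np.array_equal(x, y)


via_field = a['val'][0:2]     # field first, then slice: a view into a
via_slice = a[0:2]['val']     # slice first, then field: also a view into a
via_fancy = a[[0, 1]]['val']  # index array first: a field view of a copy

print(overlaps_and_equal(via_field, via_slice))  # True
print(overlaps_and_equal(via_field, via_fancy))  # False
print(np.array_equal(via_field, via_fancy))      # True: same values, different memory
```

The last line shows why comparing values alone is not enough: the copy holds the same numbers but lives in separate memory.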

Ehsan · Accepted Answer · 2020-04-24T18:04:16.253

Great observation. Per the NumPy docs: "For all cases of index arrays, what is returned is a copy of the original data, not a view as one gets for slices." Indexing a structured array with a single integer, by contrast, returns something that acts like a view.

Also note that, according to the structured-array docs (hosted on docs.scipy.org), accessing a field of a structured array creates a view, and indexing with an integer returns a structured scalar. Unlike other NumPy scalars, "structured scalars are mutable and act like views into the original array, such that modifying the scalar will modify the original array. Structured scalars also support access and assignment by field name."

While a structured scalar might not share memory with the array (I am not sure of its internal implementation), it acts like a view and changes the original array. Therefore, when you index your array with a single integer, the result acts like a view and assigning to it changes the original array, whereas when you index it with an array of integer indices, a copy is created and the original array is left unchanged.
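A minimal sketch of how one might check this, using the array from the question. The variable names `field`, `sliced` and `fancy` are only illustrative, and `np.shares_memory` is used here as a stand-in for "is a view"; it reports whether two arrays overlap in memory.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6], dtype=[('val', 'i4')])

field = a['val']    # field access: a view into a
sliced = a[0:2]     # slice (basic indexing): a view into a
fancy = a[[0, 1]]   # index array (advanced indexing): a copy

print(np.shares_memory(a, field))   # True
print(np.shares_memory(a, sliced))  # True
print(np.shares_memory(a, fancy))   # False

# Rows first, then field: the value is written into a temporary copy.
a[[0, 1]]['val'] = 5
after_copy_assign = a['val'].tolist()   # still [1, 2, 3, 4, 5, 6]

# Field first, then rows: the value is written through the view into a.
a['val'][[0, 1]] = 15
after_view_assign = a['val'].tolist()   # [15, 15, 3, 4, 5, 6]

print(after_copy_assign)
print(after_view_assign)
```

So when several rows need to change, selecting the field first and then the rows keeps the write on a view of the original array.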

Thank you! This explains the behaviour, though I think it is still very unintuitive and prone to mistakes. — mapf, Apr 27 '20 at 07:31
