view of numpy with 2D slicing

Question

NumPy uses views to minimize memory copying. The code below indexes the original ndarray with an index list. The printed base is None, which means arr[ [0, 3] ] was allocated its own memory.

import numpy as np
arr = np.arange(5)
print(arr[ [0, 3] ].base)

With range slicing, arr[0:2] returns a view whose base is the original ndarray ([0 1 2 3 4]).

print(arr[ 0:2 ].base)

With 2D indexing it behaves differently than I expected. I expected None to be printed, but got an array instead.

arr = np.array([
    [0, 1, 2],
    [3, 4, 5],
    [6, 7, 8],
])

print(arr[  : , [0, 2] ].base )

[[0 3 6]
 [2 5 8]]

I wonder why None wasn't returned, and why the shape of base wasn't (3, 2).

hpaulj · Answer 1 · 2021-07-03T16:02:13.100

In [205]: arr = np.array([
     ...:     [0, 1, 2],
     ...:     [3, 4, 5],
     ...:     [6, 7, 8],
     ...: ])
     ...: 
     ...: arr[  : , [0, 2] ]
Out[205]: 
array([[0, 2],
       [3, 5],
       [6, 8]])

The array indexing does what we expect. It's just the base that's different.

In [206]: _.base
Out[206]: 
array([[0, 3, 6],
       [2, 5, 8]])

I think this base gives clues as to the underlying process of the indexing. It made a copy with the [0,2] advanced indexing, and performed some sort of transpose to return the desired array.
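That guess can be checked by hand. The following is only a minimal sketch, not a description of NumPy's actual internals: it assumes the process is equivalent to picking rows of the transposed array and then transposing back. The names manual_base and manual_result are made up for illustration.

```python
import numpy as np

arr = np.array([
    [0, 1, 2],
    [3, 4, 5],
    [6, 7, 8],
])

# Hypothetical reconstruction: pick rows 0 and 2 of the transpose.
# Advanced indexing returns new data here, not a view of arr.
manual_base = arr.T[[0, 2]]

# Transposing back is cheap: .T never copies, it only swaps strides.
manual_result = manual_base.T

print(manual_base)
print(np.array_equal(manual_result, arr[:, [0, 2]]))
```

The first print matches the base shown above, and the second confirms that the transposed copy holds the same values as arr[:, [0, 2]].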

I haven't paid a whole lot of attention to base. If there's doubt about whether something is a view or not, I like to compare the __array_interface__.

Usually it's enough to see that the 'data' address is clearly different (and not just an offset into the same buffer).

In [209]: arr.__array_interface__
Out[209]: 
{'data': (38026368, False),
 'strides': None,
 'descr': [('', '<i8')],
 'typestr': '<i8',
 'shape': (3, 3),
 'version': 3}
In [210]: arr[  : , [0, 2] ].__array_interface__
Out[210]: 
{'data': (38386288, False),
 'strides': (8, 24),
 'descr': [('', '<i8')],
 'typestr': '<i8',
 'shape': (3, 2),
 'version': 3}
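If reading raw addresses feels error-prone, np.shares_memory asks the same question directly. This is a small sketch rather than part of the session above; the variable names basic and fancy are illustrative only.

```python
import numpy as np

arr = np.array([
    [0, 1, 2],
    [3, 4, 5],
    [6, 7, 8],
])

basic = arr[0:2]          # basic slicing
fancy = arr[:, [0, 2]]    # advanced (list) indexing

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(arr, basic))   # True
print(np.shares_memory(arr, fancy))   # False

# Writing through the slice changes arr; writing to the fancy result does not.
basic[0, 0] = 100
fancy[0, 1] = -1
print(arr[0, 0], arr[0, 2])           # 100 2
```

So even though fancy.base is not None, the result is still independent of arr. Its base is an intermediate copy, not the original array.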

Looking further at the strides:

In [218]: arr.strides
Out[218]: (24, 8)
In [219]: arr[:,[0,2]].strides
Out[219]: (8, 24)
In [220]: arr[:,[0,2]].copy().strides
Out[220]: (16, 8)

The strides of arr are (3*8, 8), since stepping to the next row means skipping 3 elements of 8 bytes each. The indexed strides are reversed, which is what I'd expect from the transpose of its base, a (2, 3) array with strides (24, 8). The full copy has (2*8, 8), reflecting its 2-column shape.
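The stride arithmetic can be written in terms of itemsize so it does not depend on the platform's default integer size. This is a sketch that fixes the dtype to int64 to match the 8-byte values in the session. The strides of the indexed result are an implementation detail, so they are printed rather than relied upon.

```python
import numpy as np

arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]], dtype=np.int64)
item = arr.itemsize            # 8 bytes for int64

indexed = arr[:, [0, 2]]
copied = indexed.copy()        # copy() defaults to C order

# Row stride = number of columns * itemsize, column stride = itemsize.
print(arr.strides == (3 * item, item))      # True
print(copied.strides == (2 * item, item))   # True

# Implementation detail of advanced indexing: inspect, don't depend on it.
print(indexed.strides, indexed.flags['C_CONTIGUOUS'])
```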

So here base is revealing details of the indexing process. I haven't noticed those before, and I've been using numpy for quite some time.

Another case where base reflects the construction process is making your arr via arange:

In [213]: np.arange(9).reshape(3,3).base
Out[213]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])

That base comes from the arange, yet we never assigned it to a variable.
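Splitting that one-liner into two steps makes the relationship explicit. This is a small sketch, and the names flat and grid are illustrative only.

```python
import numpy as np

flat = np.arange(9)          # owns its data
grid = flat.reshape(3, 3)    # reshape of a contiguous array returns a view

print(flat.base is None)     # True
print(grid.base is flat)     # True

# Dropping our own name for the 1D array does not free it:
# the view keeps it alive through .base.
del flat
print(grid.base)             # [0 1 2 3 4 5 6 7 8]
```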
