In [55]: arr = np.array([
...: [1,2,3,4],
...: [5,6,7,8],
...: [9,10,11,12],
...: [13,14,15,16]
...: ])
...: m = [False,True,True,False]
In all of your examples we can use this m1 instead of the boolean list:
In [58]: m1 = np.where(m)[0]
In [59]: m1
Out[59]: array([1, 2])
If m were a 2d array like arr, then we could use it to select elements from arr, but they would be raveled. When selecting along one dimension, the equivalent integer index array is clearer. Yes, we could use np.array([2,1]) or np.array([2,1,1,2]) to select rows in a different order or even multiple times, but substituting m1 for m does not lose any information or control.
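As a quick check of that claim, here is a minimal sketch that rebuilds arr, m and m1 from the session above. The names same_rows, reordered and repeated are introduced here only for illustration.

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)     # same values as arr above
m = [False, True, True, False]
m1 = np.where(m)[0]                      # array([1, 2])

# The boolean mask and its integer equivalent select the same rows.
same_rows = np.array_equal(arr[np.array(m)], arr[m1])

# Integer indices can also reorder or repeat rows, which a mask cannot do.
reordered = arr[np.array([2, 1])]        # row 2, then row 1
repeated = arr[np.array([2, 1, 1, 2])]   # rows 2, 1, 1, 2
print(same_rows, reordered.shape, repeated.shape)
```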
Select rows, or columns:
In [60]: arr[m1]
Out[60]:
array([[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
In [61]: arr[:,m1]
Out[61]:
array([[ 2, 3],
[ 6, 7],
[10, 11],
[14, 15]])
With 2 arrays, we get 2 elements, arr[1,1] and arr[2,2]:
In [62]: arr[m1, m1]
Out[62]: array([ 6, 11])
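One way to read [62] is that the two index arrays are paired element by element. The sketch below compares it with an explicit loop; paired and explicit are names made up for this illustration.

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)
m1 = np.array([1, 2])

# Two 1-D index arrays of the same shape are paired element by element:
# (1, 1) and (2, 2), not every combination of rows and columns.
paired = arr[m1, m1]
explicit = np.array([arr[i, j] for i, j in zip(m1, m1)])
print(paired, explicit)
```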
Note that in MATLAB we have to use sub2ind to do the same thing. What's easy in numpy is a bit harder in MATLAB; for blocks it's the other way around.
To get a block, we have to create a column array to broadcast with the row one:
In [63]: arr[m1[:,None], m1]
Out[63]:
array([[ 6, 7],
[10, 11]])
If that's too hard to remember, np.ix_ can do it for us:
In [64]: np.ix_(m1,m1)
Out[64]:
(array([[1],
[2]]),
array([[1, 2]]))
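The tuple returned by np.ix_ can be passed straight to arr as an index. A minimal sketch, with block and manual as illustrative names, showing that it matches the hand-built column index from [63]:

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)
m1 = np.array([1, 2])

# np.ix_ returns a (2, 1) column index and a (1, 2) row index,
# so indexing with the tuple broadcasts to a 2x2 block.
block = arr[np.ix_(m1, m1)]
manual = arr[m1[:, None], m1]
print(block)
```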
[63] uses the same indexing mechanism as [62]; the difference is that the 2 index arrays broadcast against each other differently. It's the same broadcasting as done in these additions:
In [65]: m1+m1
Out[65]: array([2, 4])
In [66]: m1[:,None]+m1
Out[66]:
array([[2, 3],
[3, 4]])
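To make the link between broadcasting and the indexed result explicit, the following sketch inspects the broadcast shapes and expands the index arrays. The variable names are illustrative and not part of the session above.

```python
import numpy as np

m1 = np.array([1, 2])

# The broadcast shape of the index arrays is the shape of the indexed result.
paired_shape = np.broadcast(m1, m1).shape            # (2,)
block_shape = np.broadcast(m1[:, None], m1).shape    # (2, 2)

# Expanding the broadcast shows which (row, col) pairs are actually read.
rows, cols = np.broadcast_arrays(m1[:, None], m1)
print(paired_shape, block_shape)
print(rows)
print(cols)
```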
This indexing behavior is perfectly consistent - provided we don't import expectations from other languages.
I used m1 because boolean arrays don't broadcast, as shown below:
In [67]: np.array(m)
Out[67]: array([False, True, True, False])
In [68]: np.array(m)[:,None]
Out[68]:
array([[False],
[ True],
[ True],
[False]])
In [69]: arr[np.array(m)[:,None], np.array(m)]
...
IndexError: too many indices for array
In fact, the 'column' boolean doesn't work on its own either:
In [70]: arr[np.array(m)[:,None]]
...
IndexError: boolean index did not match indexed array along dimension 1; dimension is 4 but corresponding boolean dimension is 1
We can use logical_and (written here with the & operator, which is equivalent for boolean arrays) to broadcast a column boolean against a row boolean:
In [72]: mb = np.array(m)
In [73]: mb[:,None]&mb
Out[73]:
array([[False, False, False, False],
[False, True, True, False],
[False, True, True, False],
[False, False, False, False]])
In [74]: arr[_]
Out[74]: array([ 6, 7, 10, 11]) # 1d result
This is the case you quoted: "If obj.ndim == x.ndim, x[obj] returns a 1-dimensional array filled with the elements of x corresponding to the True values of obj"
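If the 2d block shape is wanted from boolean masks, one option is np.ix_, which accepts boolean sequences and converts each to integer indices. A sketch under that assumption, with flat and block as illustrative names:

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)
mb = np.array([False, True, True, False])

# A full 2-D mask returns the selected elements flattened to 1-D.
flat = arr[mb[:, None] & mb]

# np.ix_ turns each boolean mask into integer indices (like np.nonzero),
# so the 2-D block shape is preserved.
block = arr[np.ix_(mb, mb)]
print(flat, block.shape)
```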
Your other quote:
*"Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view)."*
means that if arr1 = arr[m,:], arr1 is a copy, and any modifications to arr1 will not affect arr. However, I could use arr[m,:]=10 to modify arr, since assignment through an index writes into arr itself. The alternative to a copy is a view, as in basic indexing, arr2=arr[0::2,:]. Modifications to arr2 do modify arr as well.
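The copy/view distinction can be checked directly. This is a minimal sketch; copy_untouched and view_written are names introduced here for illustration, and np.shares_memory is used only as a convenient check.

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)
m = [False, True, True, False]

arr1 = arr[m, :]          # advanced (boolean) indexing: a copy
arr1[:] = 0               # does not touch arr
copy_untouched = bool(arr[1, 0] == 5)

arr2 = arr[0::2, :]       # basic slicing: a view of rows 0 and 2
arr2[:] = -1              # writes through to arr
view_written = bool((arr[0] == -1).all() and (arr[2] == -1).all())

arr[m, :] = 10            # indexed assignment modifies arr in place
print(np.shares_memory(arr, arr1), np.shares_memory(arr, arr2))
print(arr)
```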