
I was surprised by the result of the last expression:

>>> from numpy import array, arange
>>> a = arange(12).reshape(3,4)
>>> b1 = array([False,True,True])             # first dim selection
>>> b2 = array([True,False,True,False])       # second dim selection
>>>
>>> a[b1,:]                                   # selecting rows
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>>
>>> a[b1]                                     # same thing
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>>
>>> a[:,b2]                                   # selecting columns
array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])
>>>
>>> a[b1,b2]                                  # a weird thing to do
array([ 4, 10])

I expected:

array([[ 4,  6],
       [ 8, 10]])

Can anyone explain why this is the case?

jpp
ady
  • `a[b1, :][:, b2]` would produce your expected output. And also check out the indexing doc in [NumPy User Guide](https://docs.scipy.org/doc/numpy/user/basics.indexing.html#boolean-or-mask-index-arrays) and in [NumPy Reference](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#boolean-array-indexing). – YaOzI Jul 04 '18 at 16:00
  • Yes, I tested it: both `a[:,b2][b1]` and `a[b1][:,b2]` give `array([[ 4, 6], [ 8, 10]])`. – ady Jul 04 '18 at 18:56

1 Answer


Let's start with your array:

import numpy as np

a = np.array([[ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11]])

Your current indexing logic equates to the following:

a[[1, 2], [0, 2]]  # array([ 4, 10])

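Sticking to 2 dimensions, NumPy interprets this as indexing dim1-indices [1, 2] and dim2-indices [0, 2] pairwise, which gives the coordinates (1, 0) and (2, 2). Both index arrays have the same shape, so they are matched element by element; they are not broadcast against each other into a 2x2 grid.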
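To make the equivalence concrete, here is a small sketch of how the Boolean masks turn into integer positions before the pairwise lookup. The names `rows`, `cols` and `pairwise` are only illustrative and do not appear in the question:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

# Each mask is converted to the positions of its True entries.
rows = np.nonzero(b1)[0]   # positions of True in b1
cols = np.nonzero(b2)[0]   # positions of True in b2

# The two index arrays are paired elementwise: (rows[0], cols[0]), (rows[1], cols[1]).
pairwise = a[rows, cols]
```

With these masks, `rows` is `[1, 2]` and `cols` is `[0, 2]`, so `pairwise` holds `a[1, 0]` and `a[2, 2]`, the same two values that `a[b1, b2]` returned.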

To permit broadcasting with Boolean arrays, you can use numpy.ix_:

res = a[np.ix_(b1, b2)]

print(res)

[[ 4  6]
 [ 8 10]]

The magic ix_ performs is noted in the docs: "Boolean sequences will be interpreted as boolean masks for the corresponding dimension (equivalent to passing in np.nonzero(boolean_sequence))."

print(np.ix_(b1, b2))

(array([[1],
        [2]], dtype=int64), array([[0, 2]], dtype=int64))

As a side note, you can use a more direct approach if you have integer indices:

b1 = np.array([1, 2])
b2 = np.array([0, 2])

a[b1[:, None], b2]
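The reason this works is that the two integer arrays now have different shapes, so NumPy broadcasts them against each other. A short check, using the illustrative names `rows` and `cols` in place of the reassigned `b1` and `b2`:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
rows = np.array([1, 2])
cols = np.array([0, 2])

# Shape (2, 1) against shape (2,) broadcasts to (2, 2).
grid_shape = np.broadcast(rows[:, None], cols).shape

block = a[rows[:, None], cols]

# Chained indexing, as suggested in the comments, gives the same block.
chained = a[rows][:, cols]
```

Here `grid_shape` is `(2, 2)`, and `block` matches both the chained result and the output the question expected.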

See also: related question on why this method does not work with Boolean arrays.

jpp