How to vectorize getting sub arrays from numpy array using indexing arrays

Question

I want to get a numpy array of sub-arrays from a base array using some kind of indexing arrays (I'm open to suggestions on the style/format of the indexing arrays). I can easily do this with a for loop, but I'm wondering whether there is a clever way to use numpy broadcasting.

Constraints: Sub-arrays are guaranteed to be the same size.

import numpy as np

up_idx = np.array([[0, 0],
                   [0, 2],
                   [1, 1]])
lw_idx = np.array([[2, 2],
                   [2, 4],
                   [3, 3]])
base = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

samples = []

for index in range(up_idx.shape[0]):
    up_row = up_idx[index, 0]
    up_col = up_idx[index, 1]
    lw_row = lw_idx[index, 0]
    lw_col = lw_idx[index, 1]

    samples.append(base[up_row:lw_row, up_col:lw_col])

samples = np.array(samples)

print(samples)
> [[[ 1  2]
    [ 5  6]]

   [[ 3  4]
    [ 7  8]]

   [[ 6  7]
    [10 11]]]

I've tried:

vector_s = base[up_idx[:, 0]:lw_idx[:, 1], up_idx[:, 1]:lw_idx[:, 1]]

But that was just nonsensical it seems.

Oh, sorry about that. I changed the variable names before posting and missed one, apparently. — dranobob, Apr 11 '17 at 19:45

jakevdp · Answer 1 · 2017-04-11T04:42:53.310

I don't think there is a fast way to do this in general via numpy broadcasting operations – for one thing, the way you set up the problem there is no guarantee that the resulting sub-arrays will be the same shape, and thus able to fit into a single output array.

The most succinct and efficient way to solve this is probably via a list comprehension; e.g.

result = np.array([base[i1:i2, j1:j2] for (i1, j1), (i2, j2) in zip(up_idx, lw_idx)])

Unless you are extracting a very large number of slices, the Python-level loop shouldn't be much of a bottleneck.
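As a rough sketch of that approach, the list comprehension can be wrapped in a small helper that checks the shape concern raised above before stacking. The name `stack_slices` is hypothetical (it is not from the question), and the shape check is an added assumption about how you might want mismatched slices to fail:

```python
import numpy as np

def stack_slices(base, up_idx, lw_idx):
    """Stack base[r0:r1, c0:c1] for each row pair of up_idx/lw_idx (hypothetical helper)."""
    pieces = [base[r0:r1, c0:c1] for (r0, c0), (r1, c1) in zip(up_idx, lw_idx)]
    # The slices can only share one output array if they all have the same shape.
    shapes = {p.shape for p in pieces}
    if len(shapes) != 1:
        raise ValueError(f"slices have different shapes: {sorted(shapes)}")
    return np.stack(pieces)

up_idx = np.array([[0, 0], [0, 2], [1, 1]])
lw_idx = np.array([[2, 2], [2, 4], [3, 3]])
base = np.arange(1, 13).reshape(3, 4)  # same values as `base` in the question

result = stack_slices(base, up_idx, lw_idx)
print(result.shape)  # (3, 2, 2)
```

Failing loudly on mismatched shapes is a design choice here; it keeps a ragged set of slices from being silently turned into something other than a regular 3-D array.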

If you have different problem constraints (i.e. same size slice in every case) it may be possible to come up with a faster vectorized solution based on fancy indexing. For example, if every slice is of size two (as in your example above) then you can use fancy indexing like this to obtain the same result:

i, j = up_idx.T[:, :, None] + np.arange(2)
result = base[i[:, :, None], j[:, None]]

The key to understanding this fancy indexing is to realize that the result follows the broadcasted shape of the index arrays.
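To make the shape bookkeeping concrete, here is a minimal sketch of the same idea generalized to an arbitrary fixed window size. The helper name `gather_windows` and its `height`/`width` parameters are hypothetical, and the code assumes every window has the same size and lies fully inside `base`:

```python
import numpy as np

def gather_windows(base, up_idx, height, width):
    """Gather fixed-size windows via fancy indexing (hypothetical helper)."""
    # Row indices for each window: shape (n, height, 1).
    rows = (up_idx[:, 0, None] + np.arange(height))[:, :, None]
    # Column indices for each window: shape (n, 1, width).
    cols = (up_idx[:, 1, None] + np.arange(width))[:, None, :]
    # (n, height, 1) broadcast against (n, 1, width) gives (n, height, width),
    # and the result of the indexing takes that broadcast shape.
    return base[rows, cols]

up_idx = np.array([[0, 0], [0, 2], [1, 1]])
lw_idx = np.array([[2, 2], [2, 4], [3, 3]])
base = np.arange(1, 13).reshape(3, 4)  # same values as `base` in the question

# Assumption: all windows share one size, so the first row of differences suffices.
height, width = (lw_idx - up_idx)[0]
windows = gather_windows(base, up_idx, height, width)

# Reference result from plain slicing, for comparison.
expected = np.stack([base[r0:r1, c0:c1] for (r0, c0), (r1, c1) in zip(up_idx, lw_idx)])

print(windows.shape)                      # (3, 2, 2)
print(np.array_equal(windows, expected))  # True
```

One difference from slicing is worth keeping in mind: a slice that runs past the edge of `base` is silently truncated, whereas fancy indexing with an out-of-range index raises an `IndexError`.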

Thank you for the reply. I should have mentioned that the sub-arrays are guaranteed to be the same size. Having them in a single numpy array is a requirement for the next section of code, which I have already vectorized. — dranobob, Apr 11 '17 at 19:43
