
I have a 3-dimensional array, which I subset with a boolean mask computed over 2 of its 3 dimensions:

import dask.array as da
import numpy as np

np.random.seed(40)
test_arr = np.random.normal(size=(2,3,4))

array([[[-0.6075477 , -0.12613641, -0.68460636,  0.92871475],
        [-1.84440103, -0.46700242,  2.29249034,  0.48881005],
        [ 0.71026699,  1.05553444,  0.0540731 ,  0.25795342]],

       [[ 0.58828165,  0.88524424, -1.01700702, -0.13369303],
        [-0.4381855 ,  0.49344349, -0.19900912, -1.27498361],
        [ 0.29349415,  0.10895031,  0.03172679,  1.27263986]]])

bool_check = test_arr[:,:,0] < 0.6

array([[ True,  True, False],
       [ True,  True,  True]])

# shape is (5,4)
arr1 = test_arr[bool_check]
# shape is (1,4)
arr2 = test_arr[~bool_check]

Note that I would rather have made test_arr a dask array from the start, but dask doesn't let me subset with a 2-D boolean mask the way numpy does.

Now imagine that in my actual use case I do a bunch of manipulations that are irrelevant here, and then want to reconstitute arr1 and arr2 into arr3 using the same boolean mask. How would I do it?
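For comparison, here is a minimal sketch of the round trip in plain NumPy, where assignment through the 2-D mask works. It assumes the intermediate manipulations leave the shapes of arr1 and arr2 unchanged; the manipulations themselves are omitted.

```python
import numpy as np

np.random.seed(40)
test_arr = np.random.normal(size=(2, 3, 4))

# 2-D mask over the first two axes
bool_check = test_arr[:, :, 0] < 0.6

# Boolean indexing collapses the two masked axes into one
arr1 = test_arr[bool_check]    # rows where the mask is True
arr2 = test_arr[~bool_check]   # rows where the mask is False

# (manipulations that keep the shapes of arr1 and arr2 would go here)

# Write both pieces back through the same mask
arr3 = np.zeros_like(test_arr)
arr3[bool_check] = arr1
arr3[~bool_check] = arr2
```

With no manipulation in between, arr3 ends up equal to test_arr. The dask attempt below tries the same kind of assignment.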

arr3 = da.zeros_like(test_arr)

# this gives an error
arr3[da.from_array(bool_check)] = arr1
ValueError: Boolean index assignment in Dask expects equally shaped arrays.
matsuo_basho
  • So I did get this to work using all numpy arrays, but with Dask I get an error: ValueError: Boolean index assignment in Dask expects equally shaped arrays. "Alternatively, you can use the extended API that supports indexing with tuples" – matsuo_basho Jul 19 '23 at 20:38
  • Looks like this is a known dask limitation: https://stackoverflow.com/questions/72273565/how-to-apply-a-2d-boolean-array-on-a-3d-dask-array-in-python https://github.com/dask/dask/issues/7550 – matsuo_basho Jul 21 '23 at 17:31
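One possible way around this limitation is sketched here. It only applies if the manipulations can be run on full-shape arrays rather than on the compacted arr1/arr2, which is an assumption about the use case. The idea is to broadcast the 2-D mask over the last axis and select elementwise with da.where instead of assigning through the mask. The names d_arr, d_mask, branch_true and branch_false, the chunk sizes, and the two placeholder manipulations are hypothetical.

```python
import dask.array as da
import numpy as np

np.random.seed(40)
test_arr = np.random.normal(size=(2, 3, 4))
bool_check = test_arr[:, :, 0] < 0.6

# Wrap the data and the mask as dask arrays (chunk sizes are arbitrary here)
d_arr = da.from_array(test_arr, chunks=(1, 3, 4))
d_mask = da.from_array(bool_check, chunks=(1, 3))

# Add a trailing axis so the (2, 3) mask broadcasts against (2, 3, 4)
mask3d = d_mask[:, :, None]

# Placeholder manipulations, each computed on the full-shape array
branch_true = d_arr * 2    # stands in for whatever is done to arr1
branch_false = d_arr - 1   # stands in for whatever is done to arr2

# Take branch_true where the mask is True, branch_false elsewhere
arr3 = da.where(mask3d, branch_true, branch_false)
result = arr3.compute()
```

This computes both branches for every element, so it trades extra work for staying lazy and avoiding boolean index assignment altogether.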

0 Answers