I have a 3-dimensional array, which I subset with a boolean mask over 2 of its 3 dimensions:
import dask.array as da
import numpy as np
np.random.seed(40)
test_arr = np.random.normal(size=(2,3,4))
array([[[-0.6075477 , -0.12613641, -0.68460636, 0.92871475],
[-1.84440103, -0.46700242, 2.29249034, 0.48881005],
[ 0.71026699, 1.05553444, 0.0540731 , 0.25795342]],
[[ 0.58828165, 0.88524424, -1.01700702, -0.13369303],
[-0.4381855 , 0.49344349, -0.19900912, -1.27498361],
[ 0.29349415, 0.10895031, 0.03172679, 1.27263986]]])
bool_check = test_arr[:,:,0] < 0.6
array([[ True, True, False],
[ True, True, True]])
# shape is (5,4)
arr1 = test_arr[bool_check]
# shape is (1,4)
arr2 = test_arr[~bool_check]
Note that I would rather have made test_arr a dask array from the start, but dask doesn't allow me to subset in this way like numpy does.
Now imagine that in my actual use case I do a bunch of manipulations that are irrelevant here, and then want to reconstitute arr1 and arr2 into arr3 by subsetting with the same mask. How would I do it?
arr3 = da.zeros_like(test_arr)
# this gives an error
arr3[da.from_array(bool_check)] = arr1
ValueError: Boolean index assignment in Dask expects equally shaped arrays.
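For comparison, here is a minimal sketch of the reconstitution I am after, written in plain NumPy. It assumes arr1 and arr2 are in-memory NumPy arrays (a dask array would need `.compute()` first) and that the intermediate manipulations keep the number and order of rows unchanged. Neither assumption is guaranteed in my real use case.

```python
import numpy as np

np.random.seed(40)
test_arr = np.random.normal(size=(2, 3, 4))

# 2-D boolean mask over the first two axes of test_arr
bool_check = test_arr[:, :, 0] < 0.6

# Masking collapses the first two axes into a single axis of selected rows
arr1 = test_arr[bool_check]
arr2 = test_arr[~bool_check]

# (the irrelevant manipulations would go here; assumed to preserve row order)

# Write both pieces back into an array with the original shape.
# Boolean assignment fills rows in the same row-major order that
# boolean selection produced them, so each row returns to its position.
arr3 = np.zeros_like(test_arr)
arr3[bool_check] = arr1
arr3[~bool_check] = arr2
```

With no manipulations in between, arr3 ends up equal to test_arr. The result could then be wrapped with `da.from_array(arr3)` if a dask array is needed afterwards, though that does not keep the assignment itself lazy.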