Let's say I have something like this:
import numpy as np
import dask
import dask.array

def foo(a):
    return a.sum()

x = np.random.rand(1000000, 70)
X = dask.array.from_array(x)
X_list = [dask.delayed(foo)(X) for n in range(600)]
Xsums = dask.compute(*X_list)
This seems to hang during execution because of memory issues, which makes me suspect that when compute is run, dask creates many copies of the array X in memory. When there are only 2 elements in X_list, this executes fine.
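For a sense of scale, here is a rough back-of-the-envelope estimate of the memory involved. It assumes the array is float64 (which is what np.random.rand returns) and, as a purely hypothetical worst case, that every one of the 600 tasks holds its own full copy of X. Whether dask actually does that is exactly what I'm unsure about.

```python
import numpy as np

# Shape and dtype taken from the snippet above; np.random.rand returns float64.
rows, cols = 1_000_000, 70
itemsize = np.dtype(np.float64).itemsize  # 8 bytes per element

bytes_per_copy = rows * cols * itemsize
n_tasks = 600  # number of delayed calls in X_list

# Hypothetical worst case: every task holds its own full copy of X.
worst_case_bytes = bytes_per_copy * n_tasks

print(f"one copy:   {bytes_per_copy / 1e9:.2f} GB")
print(f"600 copies: {worst_case_bytes / 1e9:.2f} GB")
```

A single copy is about 0.56 GB, so even a few dozen simultaneous copies would exhaust the memory of a typical machine, while 2 copies would fit comfortably.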
Is there a way to make dask use pointers to X instead of creating 600 copies of X?