I am trying to use dask for an embarrassingly parallel workload.
An example would be summing two large arrays element by element:
from time import sleep

x1 = ...
x2 = ...

def func(a, b):
    # simulate some per-call overhead plus work proportional to the block size
    sleep(0.001 + len(a) * 1e-5)
    return a + b + 1
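Called directly, a single invocation handles a whole block of data (a minimal illustration with NumPy arrays; the sizes are just placeholders):

import numpy as np

# one call to func handles an entire block
a = np.ones(10_000)
b = np.ones(10_000)
c = func(a, b)  # sleeps roughly 0.1 s and returns a + b + 1 for all 10,000 elements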
I found [1], which seemingly answers this question; however, my function isn't that expensive, and it can take an arbitrary amount of data to compute at once. I would really like to chunk the data to avoid OOM, and dask.array seems to do that. I managed to get it "working" by wrapping func with da.as_gufunc(signature="(),()->()", output_dtypes=float, vectorize=True) - however, that applies the function element-wise, and I need it applied per chunk.
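For reference, the gufunc wrapping looked roughly like this (a sketch; the array shapes and chunk sizes here are placeholders):

import dask.array as da

# wrapping func as a generalized ufunc; with signature "(),()->()" and
# vectorize=True the function ends up being invoked per element, not per chunk
func_gu = da.as_gufunc(signature="(),()->()", output_dtypes=float, vectorize=True)(func)

x1 = da.ones((4096, 4096), chunks=1024)
x2 = da.ones((4096, 4096), chunks=1024)
out = func_gu(x1, x2)  # lazy; each element triggers its own call to func when computed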
I also tried dask.bag.from_sequence, but that converts the arrays x1 and x2 into Python lists, which causes OOM (or horrible performance).
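That attempt was roughly along these lines (a sketch; exactly how I fed and partitioned the data is a guess at what I did):

import dask.bag as db

# from_sequence iterates the arrays into ordinary Python objects,
# which is what causes the OOM / terrible performance
bag = db.from_sequence(zip(x1, x2), npartitions=128)
result = bag.map(lambda pair: func(pair[0], pair[1]))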
The closest I got to what I want is this:
from dask.diagnostics import ProgressBar
import zarr
import dask.array as da

d1 = da.ones((1024, 1024, 1024, 128), chunks=64)
d2 = da.ones((1024, 1024, 1024, 128), chunks=64)
d3 = d1 + d2 + 1

with ProgressBar():
    d3.to_zarr("three.zarr")
However, I need to replace d1 + d2 + 1 with my function func.
[1] https://examples.dask.org/applications/embarrassingly-parallel.html