From this and this question, it seems that the Dask-ish way to program a stencil is to use `dask.array.map_blocks` together with `dask.array.ghost`, or `dask.array.map_overlap`. So I have the following code:
```python
import math
import dask.array as da

def local_stencil(block):
    # 7-point average over the interior; the one-cell halo is left untouched.
    block[1:-1, 1:-1, 1:-1] = (
        block[1:-1, 1:-1, 1:-1]
        + block[0:-2, 1:-1, 1:-1] + block[2:, 1:-1, 1:-1]
        + block[1:-1, 0:-2, 1:-1] + block[1:-1, 2:, 1:-1]
        + block[1:-1, 1:-1, 0:-2] + block[1:-1, 1:-1, 2:]
    ) / 7.0
    return block

def stencil(grid, iterations, workers):
    g = da.ghost.ghost(grid, depth={0: 1, 1: 1, 2: 1},
                       boundary={0: 0, 1: 1, 2: 0})
    chunk_size = int(math.pow(g.shape[0] ** 3 / workers, 1 / 3))
    for iteration in range(iterations):
        g = da.from_array(g.map_blocks(local_stencil).compute(),
                          chunks=chunk_size)
    return da.ghost.trim_internal(g, {0: 1, 1: 1, 2: 1})
```
I don't know why, but compared to a NumPy version of the same stencil function (`local_stencil`), it performs considerably worse.
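For reference, the NumPy version I am comparing against is essentially the same kernel applied repeatedly to a plain ndarray (a sketch; the function name `stencil_np` is just illustrative):

```python
import numpy as np

def local_stencil_np(block):
    # Same 7-point average as local_stencil, on a plain NumPy array.
    block[1:-1, 1:-1, 1:-1] = (
        block[1:-1, 1:-1, 1:-1]
        + block[0:-2, 1:-1, 1:-1] + block[2:, 1:-1, 1:-1]
        + block[1:-1, 0:-2, 1:-1] + block[1:-1, 2:, 1:-1]
        + block[1:-1, 1:-1, 0:-2] + block[1:-1, 1:-1, 2:]
    ) / 7.0
    return block

def stencil_np(grid, iterations):
    # Apply the kernel in place, once per iteration.
    for _ in range(iterations):
        grid = local_stencil_np(grid)
    return grid
```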
Are there any changes I can make to my code to improve performance?
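One variation I am considering is keeping the whole iteration lazy with `map_overlap`, so the halo exchange happens inside the graph and `compute()` is called only once at the end (a sketch based on my reading of the `map_overlap` docs; I have not verified it is actually faster):

```python
import numpy as np
import dask.array as da

def local_stencil(block):
    # 7-point average over the interior; the one-cell halo is left untouched.
    block = block.copy()  # avoid mutating the input block in place
    block[1:-1, 1:-1, 1:-1] = (
        block[1:-1, 1:-1, 1:-1]
        + block[0:-2, 1:-1, 1:-1] + block[2:, 1:-1, 1:-1]
        + block[1:-1, 0:-2, 1:-1] + block[1:-1, 2:, 1:-1]
        + block[1:-1, 1:-1, 0:-2] + block[1:-1, 1:-1, 2:]
    ) / 7.0
    return block

def stencil_lazy(grid, iterations):
    g = grid
    for _ in range(iterations):
        # map_overlap exchanges a one-cell halo each sweep and trims it
        # again, so the graph stays lazy until compute().
        g = g.map_overlap(local_stencil, depth=1, boundary=0)
    return g  # call .compute() once on the result

result = stencil_lazy(da.ones((16, 16, 16), chunks=(8, 8, 8)), 2).compute()
```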
From this other answer, I understand how Dask is helpful when you have a file larger than RAM, but could Dask also help with compute-bound operations like matrix multiplication or convolution?
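By compute-bound I mean something like the following blocked matrix multiplication, where the `@` operator builds a graph of per-chunk `np.matmul` calls that the scheduler can run on several cores (a minimal sketch; sizes and chunk shapes are arbitrary):

```python
import numpy as np
import dask.array as da

# Two chunked random matrices; each 128x128 chunk is a NumPy array.
n = 512
a = da.random.random((n, n), chunks=(128, 128))
b = da.random.random((n, n), chunks=(128, 128))

# The matmul is lazy until compute(); the contraction is split into
# per-chunk products that can execute in parallel.
c = (a @ b).compute()
```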