Until now, I've used dask with get and a dictionary to define the dependency graph of my tasks. But this means I have to define the whole graph up front, and now I want to add new tasks from time to time (with dependencies on old tasks).
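For context, here is a minimal sketch of the dictionary-based approach I mean. The functions load, inc and add and the key names are placeholders made up for illustration, not my real tasks, and I'm assuming the threaded scheduler's get here:

```python
from dask.threaded import get

# Placeholder tasks (hypothetical, for illustration only)
def load():
    return 1

def inc(x):
    return x + 1

def add(x, y):
    return x + y

# The whole graph is one dict: key -> (callable, *args).
# An argument that matches another key is treated as a dependency.
dsk = {
    "t1": (load,),
    "t2": (inc, "t1"),
    "t3": (add, "t1", "t2"),
}

# Every task has to be in dsk before get is called.
t2, t3 = get(dsk, ["t2", "t3"])
```

The limitation is visible in the last line: the graph is fixed at the moment get runs, so a task that shows up later cannot simply be attached to t1.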
I've read about the distributed package, and it looks appropriate. I've seen two possible options to define my graph:
Using delayed, and defining the dependencies between tasks:

    t1 = delayed(f)()
    t2 = delayed(g1)(t1)
    t3 = delayed(g2)(t1)
    dask.compute([t2, t3])

Using map/submit, and doing something like:

    t1 = client.submit(f)
    t2 = client.map(g1, [t1])[0]
    t3 = client.map(g2, [t1])[0]

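To make the "add tasks later" requirement concrete, this is roughly the usage pattern I'm after, sketched with futures. The bodies of f, g1 and g2 are made up, and the in-process client (processes=False, with the dashboard disabled) is only an assumption to keep the sketch small, not my real setup:

```python
from dask.distributed import Client

# Placeholder tasks (hypothetical, for illustration only)
def f():
    return 10

def g1(x):
    return x + 1

def g2(x):
    return x * 2

# In-process cluster, chosen only to keep this sketch lightweight
client = Client(processes=False, dashboard_address=None)

t1 = client.submit(f)
t2 = client.submit(g1, t1)  # passing a future makes t1 a dependency

# Some time later: a new task that depends on the existing t1
t3 = client.submit(g2, t1)

results = client.gather([t2, t3])
client.close()
```

Here t3 is submitted after t1 and t2 already exist, which is the incremental behaviour I can't get from a single predefined dictionary.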
Which of these do you think is more appropriate? Thanks!