I'm using the Dask distributed scheduler, running a scheduler and 5 workers locally. I submit a list of delayed() tasks to compute().
When the number of tasks is around 20 (much larger than the number of workers) and each task takes at least 15 seconds, the scheduler starts rerunning some of the tasks (or runs them more than once in parallel).
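For reference, here is a minimal sketch of the kind of setup I mean. It is not my actual code: `slow_task` and `run` are hypothetical names, the sleep stands in for the real database work, and I use an in-process `LocalCluster` (threads, no dashboard) as an assumption to keep the example self-contained, whereas my real scheduler and workers are started separately.

```python
import time

import dask
from dask import delayed
from dask.distributed import Client, LocalCluster


def slow_task(i, seconds=15):
    # Hypothetical stand-in for the real task, which writes to a SQL database.
    time.sleep(seconds)
    return i


def run(n_tasks=20, n_workers=5, seconds=15):
    # Assumption: an in-process LocalCluster instead of separately started workers.
    with LocalCluster(
        n_workers=n_workers,
        threads_per_worker=1,
        processes=False,
        dashboard_address=None,
    ) as cluster:
        with Client(cluster):
            # Independent tasks with no dependencies between them; pure is left at its default.
            tasks = [delayed(slow_task)(i, seconds) for i in range(n_tasks)]
            # With a Client active, dask.compute uses the distributed scheduler.
            return list(dask.compute(*tasks))


if __name__ == "__main__":
    print(run())
```

I have not confirmed that this reduced version reproduces the reruns; it only shows the shape of the graph and how it is submitted.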
This is a problem since the tasks modify a SQL db and if they run again they end up raising an Exception (due to DB uniqueness constraints). I'm not setting pure=True anywhere (and I believe the default is False). Other than that, the Dask graph is trivial (no dependencies between the tasks).
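To illustrate the failure mode, here is a small sketch that uses an in-memory `sqlite3` database as a stand-in for my real one. The `results` table and the `insert_result` and `demo` functions are hypothetical; the point is only that a second run of the same task violates the uniqueness constraint.

```python
import sqlite3


def insert_result(conn, task_id):
    # Hypothetical stand-in for what each task does: insert one row per task.
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO results (task_id) VALUES (?)", (task_id,))


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (task_id INTEGER PRIMARY KEY)")
    try:
        insert_result(conn, 1)  # first run of the task succeeds
        try:
            insert_result(conn, 1)  # rerun of the same task
        except sqlite3.IntegrityError:
            return "rerun rejected"
        return "rerun accepted"
    finally:
        conn.close()
```

Calling `demo()` goes through the same path my tasks hit when the scheduler executes one of them a second time.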
I'm still not sure whether this is a feature or a bug in Dask. I have a gut feeling that it might be related to work stealing...