I am developing a distributed computing system using `dask.distributed`. Some of the tasks that I submit to it with the `Executor.map` function fail, while others that seem identical run successfully.
Does the framework provide any means of diagnosing such problems?
Update: By "failing" I mean that the counter of failed tasks increases in the Bokeh web UI provided by the scheduler. The counter of finished tasks increases too.
The function that is run by `Executor.map` returns `None`. It communicates with a database, retrieves some rows from one of its tables, performs calculations, and updates values.
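To make the shape of the task concrete, here is a minimal sketch of that kind of function. It is not the actual code: `sqlite3` stands in for the real database, and the names `process_group`, `make_demo_db`, the `items` table, and its columns are all hypothetical. The calculation is a placeholder.

```python
import sqlite3


def make_demo_db(db_path):
    """Create a small demo table (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE items (id INTEGER PRIMARY KEY, group_id INTEGER, value REAL)"
        )
        conn.executemany(
            "INSERT INTO items (id, group_id, value) VALUES (?, ?, ?)",
            [(1, 1, 1.0), (2, 1, 2.0), (3, 2, 3.0)],
        )
        conn.commit()
    finally:
        conn.close()


def process_group(db_path, group_id):
    """Read the rows of one group, compute new values, and write them back."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT id, value FROM items WHERE group_id = ?", (group_id,)
        ).fetchall()
        # Placeholder calculation: the real computation is not shown here.
        updates = [(value * 2, row_id) for row_id, value in rows]
        conn.executemany("UPDATE items SET value = ? WHERE id = ?", updates)
        conn.commit()
    finally:
        conn.close()
    # No explicit return statement, so the function returns None.
```

Each task runs a function of this kind on one element of the mapped sequence. Since it returns `None` in every case, the result itself carries no information about what went wrong.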
I have more than 40,000 tasks in the map, so going through the logs by hand is rather tedious.