I have come across an issue where the Dask scheduler gets killed with a memory error (though the workers keep running) if a large number of tasks are submitted in a short period of time.
If it were possible to query the current number of tasks on the cluster, it would be easy to cap the number of concurrent tasks submitted to it.
NOTE: Tasks are being submitted to the same scheduler from multiple clients.
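For what it's worth, one workaround I've been experimenting with: `Client.run_on_scheduler` can run a function inside the scheduler process (it injects the `Scheduler` object when the function takes a `dask_scheduler=` keyword argument), and the scheduler's `tasks` dict gives the current task count. A minimal sketch of throttled submission built on that, where `count_tasks`, `submit_throttled`, the threshold, and the poll interval are all my own illustrative names and values, not part of Dask:

```python
import time

from dask.distributed import Client


def count_tasks(dask_scheduler=None):
    # Runs inside the scheduler process; run_on_scheduler injects the
    # Scheduler object via the dask_scheduler keyword argument.
    return len(dask_scheduler.tasks)


def submit_throttled(client, func, *args, max_pending=10_000, poll=1.0, **kwargs):
    # Block until the scheduler's task count drops below max_pending.
    # max_pending and poll are illustrative; tune them for your cluster.
    while client.run_on_scheduler(count_tasks) >= max_pending:
        time.sleep(poll)
    return client.submit(func, *args, **kwargs)


# Usage (each of the multiple clients would do this independently,
# connecting to the shared scheduler; address is hypothetical):
# client = Client("tcp://scheduler-host:8786")
# future = submit_throttled(client, my_func, some_arg)
```

Since every client polls the same scheduler, each one backs off on its own when the cluster is saturated, without any coordination between the clients themselves. It doesn't solve the underlying scheduler memory issue, but it has kept submission bursts in check for me.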