
When running a Dask worker, I notice a few more threads than I was expecting. How many threads should I expect to see running in a Dask worker, and what are they doing?

MRocklin
  • 55,641
  • 23
  • 163
  • 235

1 Answer


Dask workers have the following threads:

  • A pool of threads in which to run tasks. The pool size is typically somewhere between 1 and the number of logical cores on the computer
  • One administrative thread that manages the event loop, communicates over (non-blocking) sockets, responds to fast queries, allocates tasks onto worker threads, etc.
  • A couple of threads that are used for optional compression and (de)serialization of messages during communication
  • One thread to monitor and profile the two items above

By default there is also a Nanny process that watches the worker. This process has a couple of its own threads for administration.

These are internal details as of October 2018 and may change without notice.
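Since these details can change, it may be easier to look at the threads directly than to rely on a fixed list. The following is a minimal sketch using only the standard library; the helper name `thread_report` is hypothetical and not part of Dask. It counts the live threads in the current process by name.

```python
import threading
from collections import Counter


def thread_report():
    """Return a dict mapping thread name -> number of live threads with that name.

    Hypothetical helper for illustration; it only inspects the process it runs in.
    """
    return dict(Counter(t.name for t in threading.enumerate()))


if __name__ == "__main__":
    # In a plain Python process this usually shows just the main thread.
    for name, count in sorted(thread_report().items()):
        print(f"{count:3d}  {name}")
```

Assuming you have a `distributed.Client` connected to your cluster, passing this function to `client.run(thread_report)` should execute it inside each worker process and return one report per worker. The thread names you see depend on the installed version.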

People who run into "too many threads" issues are often running tasks that are themselves multi-threaded, and so get an N-squared threading issue: on a machine with N cores, N worker threads that each call into a library spawning N threads of its own produce roughly N × N threads. Often the solution is to set environment variables like OMP_NUM_THREADS=1, but this depends on the exact libraries that you're using.
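As a rough sketch of that workaround, the variables can be set from Python before the numerical libraries are imported. Only `OMP_NUM_THREADS` comes from the answer above; `MKL_NUM_THREADS` and `OPENBLAS_NUM_THREADS` are added here on the assumption that MKL or OpenBLAS is in use, and the helper `limit_library_threads` is hypothetical.

```python
import os

# OMP_NUM_THREADS is the variable named above. The other two are assumptions
# that only matter if your stack uses MKL or OpenBLAS.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def limit_library_threads(n=1, env=None):
    """Set each thread-count variable to n in env (defaults to os.environ).

    Returns the values that were written so the caller can log them.
    """
    if env is None:
        env = os.environ
    for var in THREAD_ENV_VARS:
        env[var] = str(n)
    return {var: env[var] for var in THREAD_ENV_VARS}


if __name__ == "__main__":
    # Call this before importing numpy, scipy, etc.; libraries typically
    # read these variables once, when they are first loaded.
    print(limit_library_threads(1))
```

The variables have to be in place in the worker process itself, so in practice it is often simpler to export them in the shell or job script that launches the workers.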

MRocklin