My dask workers need to run init code that depends on the number of workers in the cluster. Can workers access such cluster metadata?

Randy Gelhausen

1 Answer

A client can determine the number of workers in a cluster by calling the Client.scheduler_info method and counting the entries under 'workers'.

>>> len(client.scheduler_info()['workers'])
8

Any function run within a worker can get a client using the get_client function.

>>> from dask.distributed import get_client
>>> n = len(get_client().scheduler_info()['workers'])

http://docs.dask.org/en/latest/futures.html#distributed.get_client
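
As a minimal sketch of how this might look inside a task, the snippet below wraps the lookup in a function that can be submitted to a worker. The names count_workers and worker_init are hypothetical, and the returned dict is only a placeholder for real setup work. Some recent distributed releases appear to accept an n_workers argument on scheduler_info that limits how many workers are listed by default, so it is worth checking the documentation for your installed version.

```python
def count_workers(info):
    """Count the entries under 'workers' in a scheduler_info()-style dict."""
    return len(info["workers"])


def worker_init():
    """Hypothetical init task; meant to run on a worker, e.g. via client.submit."""
    # Imported lazily so this module loads even where dask.distributed is absent.
    from dask.distributed import get_client

    client = get_client()  # note the call: get_client() returns a Client
    n = count_workers(client.scheduler_info())
    # Placeholder for whatever setup depends on the cluster size.
    return {"n_workers": n}
```

From the client side this could be launched with client.submit(worker_init).result(), assuming client is a connected dask.distributed Client.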

Be aware, though, that this assumes you're using the dask.distributed scheduler, so the code won't work if you later switch to the basic single-machine schedulers. Also, in principle, the number of workers can change over time.
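
Because the worker count can change, one possible approach (a sketch, not the only way to do it) is to wait on the client until the expected number of workers has connected, take a single snapshot of the count, and pass it to the init code as a plain argument. The names init_worker and init_all_workers are hypothetical; Client.wait_for_workers and Client.run are existing Client methods. Workers that join after this call will not have run the init function.

```python
def init_worker(n_workers):
    """Hypothetical init routine; receives the worker count as a plain argument."""
    return f"initialised for {n_workers} workers"


def init_all_workers(client, expected):
    """Run init_worker once on every worker currently connected."""
    # Block until at least `expected` workers have connected to the scheduler.
    client.wait_for_workers(expected)
    # Snapshot the count once on the client side.
    n = len(client.scheduler_info()["workers"])
    # client.run calls the function on each worker and returns
    # a dict mapping worker address to that worker's result.
    return client.run(init_worker, n)
```

Passing the count in as an argument keeps the init code itself independent of the scheduler, which also makes it easier to test without a cluster.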

MRocklin