My dask workers need to run init code that depends on the number of workers in the cluster. Can workers access such cluster metadata?

Randy Gelhausen
- Can the people who voted to close provide a reason why it should be closed as a comment? – MRocklin Jan 03 '19 at 19:14
- just write the worker and check it – Roman Pokrovskij Jan 03 '19 at 21:01
- Comment to those voting to close: this question was very specific: a dask worker within a dask cluster getting information about that cluster. – mdurant Jan 04 '19 at 14:05
1 Answer
Clients can determine the number of workers in a cluster by using the Client.scheduler_info function.
>>> len(client.scheduler_info()['workers'])
8
Any function run within a worker can get a client using the get_client function.
>>> from dask.distributed import get_client
>>> n = len(get_client().scheduler_info()['workers'])
http://docs.dask.org/en/latest/futures.html#distributed.get_client
Be aware, though, that this assumes you are using the dask.distributed scheduler (so you cannot later switch to the basic single-machine schedulers), and that in principle the number of workers can change over time.
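Putting the two pieces together, a minimal sketch of init code that runs as a task on a worker might look like the following. The names `count_workers` and `init_task` are hypothetical, made up for illustration, and the value returned is only a snapshot of the cluster at the moment the task runs.

```python
def count_workers(scheduler_info):
    """Count the workers listed in a dict shaped like Client.scheduler_info()."""
    # Hypothetical helper: it only inspects the 'workers' mapping of the dict.
    return len(scheduler_info.get("workers", {}))


def init_task():
    """Hypothetical init function, meant to be submitted as a task to a worker."""
    # Imported lazily so count_workers stays usable without a running cluster.
    from dask.distributed import get_client

    client = get_client()  # client connected to the worker's scheduler
    # Snapshot only: workers may join or leave after this call returns.
    return count_workers(client.scheduler_info())


# Usage from the driver process (assumes a dask.distributed cluster is running):
#   from dask.distributed import Client
#   client = Client()                  # or Client("tcp://scheduler-address:8786")
#   n = client.submit(init_task).result()
```

If the init code needs a minimum number of workers, one option is to call `client.wait_for_workers(n)` from the driver before submitting the task, so that the count seen inside the task is less likely to be too low.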

MRocklin