How can I instruct dask to use a distributed Client as the scheduler from outside the code, e.g. via an environment variable?
The motivation is to take advantage of one of dask's key features: the transparency of going from a single machine to a distributed cluster. However, one small thing obscures this transparency: the need to register a Client in code.
I can set the named schedulers (e.g. "synchronous" and "processes") via the config (file/env var) as instructed here, but how do I use the same mechanism with a distributed one?
Ideally, I would like to set something like:
DASK_SCHEDULER=distributed(scheduler_file=...)
as an environment variable, which would be equivalent to running client = Client(scheduler_file=...) within Python code.
This would then mean the EXACT same code can be run in different environments (local and distributed).
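For illustration, the closest workaround I can think of is a small helper along these lines. It is only a sketch: MY_DASK_SCHEDULER_FILE is a variable name I made up for this example, not something dask reads on its own, and maybe_connect is a hypothetical helper.

```python
import os


def maybe_connect(env_var="MY_DASK_SCHEDULER_FILE"):
    """Return a distributed Client if env_var is set, otherwise None.

    The variable name is hypothetical and chosen for this sketch only;
    dask itself does not look at it.
    """
    scheduler_file = os.environ.get(env_var)
    if not scheduler_file:
        # Nothing configured: leave dask on its default scheduler
        # (or the named scheduler picked up from the dask config).
        return None

    # Imported lazily so purely local runs do not need `distributed`.
    from dask.distributed import Client

    # Creating a Client registers it as the default scheduler.
    return Client(scheduler_file=scheduler_file)


client = maybe_connect()
```

This works, but it still puts the Client call in the code, which is exactly the part I would like to move into configuration.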