I would like to direct all dask temporary data to my fast and big disk at /mnt/1. I am running the scheduler like so:
dask-scheduler --local-directory /mnt/1
and the workers:
dask-worker 127.0.0.1:8786 --memory-limit 16GB --nthreads 1 --nprocs 6 --local-directory /mnt/1/
My imports look like this:
import dask
from dask import dataframe as dd
from dask import delayed
from dask.distributed import Client
client = Client('localhost:8786', set_as_default=True)
dask.config.set(shuffle='disk')
And yet, I am still seeing a partd directory being created and filled with stuff in my /tmp directory, which is not on my fast and big disk.
My question is: how do I tell dask distributed to send absolutely all temporary data to /mnt/1 and not put anything in /tmp?
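One thing that may be worth checking, offered as a sketch rather than a confirmed fix: if the partd directory is created through Python's tempfile module with no explicit location (an assumption about what happens here, not something verified against this setup), then its location follows the TMPDIR environment variable of the process that creates it. The snippet below uses only the standard library to show that a fresh Python process started with TMPDIR set picks that directory instead of /tmp. The helper name tempdir_seen_by_child and the scratch directory standing in for /mnt/1 are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile


def tempdir_seen_by_child(tmpdir):
    """Return the default temp directory chosen by a new Python process
    that is started with TMPDIR pointing at `tmpdir` (hypothetical helper)."""
    env = dict(os.environ, TMPDIR=tmpdir)
    result = subprocess.run(
        [sys.executable, "-c", "import tempfile; print(tempfile.gettempdir())"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# A throwaway directory stands in for /mnt/1 so the sketch runs anywhere.
scratch = tempfile.mkdtemp()
seen = tempdir_seen_by_child(scratch)
print("child process temp dir:", seen)
```

If that assumption holds, starting the workers and the client process with TMPDIR=/mnt/1 in their environment would be one way to test whether the partd directory moves. Dask also has a temporary-directory configuration value that may be relevant, though whether the disk shuffle honours it in this setup is untested here.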