Questions tagged [dask-distributed]

Dask.distributed is a lightweight library for distributed computing in Python. It extends both the concurrent.futures and dask APIs to moderate-sized clusters.
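
Because dask.distributed extends the concurrent.futures API, the basic submit-and-collect pattern can be previewed with the standard library alone; `dask.distributed.Client` accepts the same shape of code via its `submit`/`gather` methods. A minimal sketch (`square` is an illustrative placeholder, not dask API):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

# dask.distributed.Client exposes the same submit()/result() pattern, so code
# written against the stdlib executor API ports to a cluster with few changes.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(square, i) for i in range(10)]
    results = [f.result() for f in futures]
```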

1090 questions
5
votes
1 answer

Why am I getting dask warnings when running a pandas operation?

I have a notebook with both pandas and dask operations. When I have not started the client, everything is as expected. But once I start the dask.distributed client, I get warnings in cells where I'm running pandas operations e.g.…
birdsarah
  • 1,165
  • 8
  • 20
5
votes
2 answers

Override dask scheduler to concurrently load data on multiple workers

I want to run graphs/futures on my distributed cluster which all have a 'load data' root task and then a bunch of training tasks that run on that data. A simplified version would look like this: from dask.distributed import Client client =…
user8871302
  • 123
  • 7
5
votes
2 answers

Progress reporting on dask's set_index

I am trying to wrap a progress indicator around the entire script. However, set_index(..., compute=False) still runs tasks on the scheduler, as can be observed in the web interface. How do I report on the progress of the set_index step? import…
kadrach
  • 408
  • 6
  • 11
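
For futures, dask.distributed ships a `progress()` helper; the underlying count-tasks-as-they-complete pattern it builds on looks like this in stdlib terms (`work` is a placeholder task):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def work(i):
    time.sleep(0.01)          # stand-in for a real task
    return i

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(work, i) for i in range(20)]
    done = 0
    for _ in as_completed(futures):   # yields each future as it finishes
        done += 1                     # a progress bar would redraw here
```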
5
votes
1 answer

Safe & performant way to modify dask dataframe

As part of a data workflow I need to modify values in a subset of dask dataframe columns and pass the results on for further computation. In particular, I'm interested in two cases: mapping columns and mapping partitions. What is the recommended safe &…
evilkonrex
  • 255
  • 2
  • 10
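
The partition-mapping case can be prototyped at the pandas level first: write a pure function over a `pandas.DataFrame` and hand it to dask's `map_partitions`. A sketch with hypothetical columns `a` and `b` (only the pandas part runs here):

```python
import pandas as pd

def transform(part: pd.DataFrame) -> pd.DataFrame:
    # Return a new frame instead of mutating in place; dask may apply this
    # to each partition independently, so side effects are unsafe.
    return part.assign(a=part["a"] * 2, b=part["b"].str.upper())

df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
out = transform(df)

# With a dask dataframe the same function would be mapped over partitions:
#   ddf.map_partitions(transform)
```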
5
votes
1 answer

Dask Distributed Unable to locate credentials

I can't access my files on S3 using a dataframe read: df_read_csv. I get the error: Exception: Unable to locate credentials. This works fine when my dask distributed setup runs against local worker cores. However, when I import a client with a…
4
votes
1 answer

Setting maximum number of workers in Dask map function

I have a Dask process that triggers 100 workers with a map function: worker_args = .... # array with 100 elements with worker parameters futures = client.map(function_in_worker, worker_args) worker_responses = client.gather(futures) I use docker…
ps0604
  • 1,227
  • 23
  • 133
  • 330
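
One general way to cap how many of the 100 mapped tasks run at once, independent of cluster size, is to throttle submissions: keep at most N futures in flight and submit the next argument only when one finishes. A stdlib sketch of that pattern (`work` and `max_in_flight` are illustrative, not dask API):

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def work(x):
    return x * x

worker_args = list(range(100))
max_in_flight = 10     # hypothetical cap on concurrently pending tasks

results = []
with ThreadPoolExecutor(max_workers=4) as pool:
    pending = set()
    for x in worker_args:
        if len(pending) >= max_in_flight:
            # Block until at least one task finishes before submitting more.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            results.extend(f.result() for f in done)
        pending.add(pool.submit(work, x))
    done, _ = wait(pending)               # drain the remainder
    results.extend(f.result() for f in done)
```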
4
votes
1 answer

Dask multi-stage resource setup causes Failed to Serialize Error

Using the exact code from Dask's documentation at https://jobqueue.dask.org/en/latest/examples.html In case the page changes, this is the code: from dask_jobqueue import SLURMCluster from distributed import Client from dask import delayed cluster =…
michaelgbj
  • 290
  • 1
  • 10
4
votes
2 answers

Running two Tensorflow trainings in parallel using joblib and dask

I have the following code that runs two TensorFlow trainings in parallel using Dask workers implemented in Docker containers. I need to launch two processes, using the same dask client, where each will train their respective models with N…
ps0604
  • 1,227
  • 23
  • 133
  • 330
4
votes
1 answer

Dask: handling unresponsive workers

When using Dask with SGE or PBS clusters I sometimes have workers becoming unresponsive. These workers are highlighted in red in the dashboard's Info section, with their "Last seen" number constantly increasing. I know this can happen if submitted…
Thomas
  • 81
  • 7
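
Dask has its own knobs for dead or silent workers, but the defensive client-side pattern is general: bound how long you wait on any one result and flag tasks that exceed it. A stdlib sketch, where one task deliberately simulates a hung worker:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError
import time

def maybe_hang(i):
    # Task 3 simulates a worker that stops responding.
    time.sleep(2 if i == 3 else 0.05)
    return i

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(maybe_hang, i) for i in range(5)]
    results = []
    for f in futures:
        try:
            results.append(f.result(timeout=1))   # bound the wait per task
        except TimeoutError:
            results.append(None)                  # mark the stuck task
```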
4
votes
1 answer

Dask aws cluster error when initializing: User data is limited to 16384 bytes

I'm following the guide here: https://cloudprovider.dask.org/en/latest/packer.html#ec2cluster-with-rapids In particular I set up my instance with packer, and am now trying to run the final piece of code: cluster = EC2Cluster( …
ZirconCode
  • 805
  • 2
  • 10
  • 24
4
votes
1 answer

Dask crashing when saving to file?

I'm trying to one-hot encode a dataset, then group by a specific column so I can get one row for each item in that column with an aggregated view of which one-hot columns are true for that row. It seems to be working on small data and using…
Lostsoul
  • 25,013
  • 48
  • 144
  • 239
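
The encode-then-aggregate step the question describes can be checked at the pandas level first; dask.dataframe mirrors `get_dummies` and `groupby`, so the same logic should carry over. A sketch with hypothetical data, taking the per-item `max()` so each item keeps a flag for every tag it ever carried:

```python
import pandas as pd

# Hypothetical data: one row per (item, tag) observation.
df = pd.DataFrame({"item": ["a", "a", "b"], "tag": ["x", "y", "x"]})

# One-hot encode the tag column, then aggregate per item with max() so
# a flag is set whenever any of the item's rows had that tag.
onehot = pd.get_dummies(df, columns=["tag"])
per_item = onehot.groupby("item").max().reset_index()
```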
4
votes
1 answer

Is there a way of using dask jobqueue over ssh

Dask jobqueue seems to be a very nice solution for distributing jobs to PBS/Slurm-managed clusters. However, if I'm understanding its use correctly, you must create an instance of "PBSCluster/SLURMCluster" on the head/login node. Then you can on the same…
4
votes
1 answer

Avoiding memory overflow while using xarray dask apply_ufunc

I need to apply a function along the time dimension of an xarray dask array of this shape: dask.array
4
votes
1 answer

Timeout OSError while running dask on local cluster

I am trying to run the following code on a Power PC with config: Operating System: Red Hat Enterprise Linux Server 7.6 (Maipo) CPE OS Name: cpe:/o:redhat:enterprise_linux:7.6:GA:server Kernel: Linux 3.10.0-957.21.3.el7.ppc64le …
Coddy
  • 549
  • 4
  • 18
4
votes
1 answer

dask.distributed SLURM cluster Nanny Timeout

I am trying to use the dask.distributed.SLURMCluster to submit batch jobs to a SLURM job scheduler on a supercomputing cluster. The jobs all submit as expected, but throw an error after 1 minute of running: asyncio.exceptions.TimeoutError: Nanny…
Ovec8hkin
  • 65
  • 1
  • 6