Questions tagged [dask-kubernetes]

Questions about using dask-kubernetes to create and run Dask distributed clusters.

Dask Kubernetes deploys Dask workers on Kubernetes clusters using native Kubernetes APIs. It is designed to dynamically launch short-lived deployments of workers during the lifetime of a Python process.


54 questions
0 votes, 1 answer

dask kubernetes aks (azure) virtual nodes

Using the code below, it is possible to create a Dask Kubernetes cluster in Azure AKS. It uses a remote scheduler (dask.config.set({"kubernetes.scheduler-service-type": "LoadBalancer"})) and works perfectly. To use virtual nodes, uncomment the line…
Nuno Silva
0 votes, 1 answer

What causes Dask futures to get stuck in 'pending' state?

I created my own very slightly modified Dockerfile based on the dask-docker Dockerfile that installs adlfs and copies one of my custom libraries into the container in order to make it available to all worker nodes. I deployed my container to my…
user655321
0 votes, 1 answer

dask kubernetes import local library

When working on a local project, from local_project.funcs import local_func will fail in the cluster because local_project is not installed on the workers. This forces me to develop everything in the same file. Solutions? Is there a way to "import" the contents of…
Nuno Silva
0 votes, 1 answer

How to Send .pem file to Dask Cluster?

I have a Dask expression as follows, where I'm trying to run a SQLAlchemy query in a distributed way. However, it references a .pem key file that is passed in the connect_args parameter. How do I upload this key file to the Dask cluster/workers…
Riley Hun
0 votes, 1 answer

How does Dask execute code on multiple VMs in the cloud?

I wrote a program with Dask and delayed, and now I want to run it on several machines in the cloud. But there's one thing I don't understand: how does Dask run the code on multiple machines in the cloud without having all the dependencies of the…
0 votes, 1 answer

What are recommended dask-kubernetes configuration overrides for long-running tasks?

I am using something along the lines of the example provided in the docs:

    import dask.bag
    from dask_kubernetes import KubeCluster
    cluster = KubeCluster.from_yaml('worker-spec.yml')
    cluster.adapt(minimum=0, maximum=24, interval="20000ms")
    dag =…
Pedro M Duarte
0 votes, 1 answer

dask-kubernetes zero workers on GKE

Noob here. I want to have a Dask install with a worker pool that can grow and shrink based on current demand. I followed the instructions in Zero to JupyterHub to install on GKE, and then went through the install instructions for dask-kubernetes:…
Patrick Mineault
0 votes, 1 answer

Deploying adaptive multi-user Dask cluster on Kubernetes

What is the proper way to deploy an adaptive multi-user Dask cluster on Kubernetes? I need a centralized cluster of machines that multiple people can use for their work, and that can add or remove machines as needed (preferably scaling down to 0 workers).
Philipp_Kats
0 votes, 2 answers

Workers fail to deserialize with rasterio

After deploying the official Dask Helm chart on Google Cloud, I updated the environment with some extra conda packages, specifically xarray and rasterio. If I try to run my code, I get this error in the worker logs and the procedure…
Cursore