
I have a Dask cluster on AKS and I want to run a function f in parallel, but have each execution of this function run in a single process allocated to a single pod. According to the documentation on Worker Resources, I should start each worker with dask-worker scheduler:8786 --nthreads 6 --resources "process=1". I need this because f uses multithreading internally.
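To make the example below self-contained, here is a minimal stand-in for f; it is only an illustration of "uses multithreading internally", not the real function:

from concurrent.futures import ThreadPoolExecutor

def f():
    # Placeholder workload: fans work out to several threads internally,
    # which is why each call should get a whole worker process to itself.
    def work(i):
        return i * i
    with ThreadPoolExecutor(max_workers=6) as pool:
        return sum(pool.map(work, range(100)))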

# Example adapting up to 10 pods

from dask.distributed import Client
from dask_kubernetes import KubeCluster

cluster = KubeCluster(pod_template="pod_template.yml", deploy_mode="remote")
cluster.adapt(minimum=0, maximum=10)
client = Client(cluster)

# for this example suppose f has no arguments
futures = [client.submit(f, resources={"process": 1}) for _ in range(5)]  # 5 executions of f (could use map, but this is an example)
results = [ft.result() for ft in futures]

When I execute the code above, 5 worker pods are started, but the executions of f are carried out on only one of those 5, and sequentially.

If, instead of the adapt method, I manually call cluster.scale(5), the executions of f run as I want most of the time. I say most of the time because sometimes the behavior is similar to that of the adapt method, which seems very strange to me.
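For clarity, that variant only swaps the adaptive call for a fixed size:

cluster.scale(5)  # instead of cluster.adapt(minimum=0, maximum=10)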

Here is my pod_template.yaml file:

apiVersion: v1
kind: Pod
spec:
  restartPolicy: Never
  containers:
  - image: MyCustomDockerImage
    imagePullPolicy: IfNotPresent
    args: [dask-worker, --nthreads, '6', --no-dashboard, --memory-limit, 8GB, --death-timeout, '60', --resources, 'process=1']
    name: testdask
    resources:
      limits:
        cpu: "8"
        memory: 8G
      requests:
        cpu: "8"
        memory: 8G
  imagePullSecrets:
    - name: acr-secret
  tolerations:
    - key: workloadpool
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  nodeSelector:
    nodepool: workloadpool
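A quick way to confirm that each worker actually advertises the declared resource once its pod is up (the exact layout of scheduler_info may vary across distributed versions):

# Sanity check: every worker should report {'process': 1.0} here.
for addr, info in client.scheduler_info()["workers"].items():
    print(addr, info.get("resources"))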
  • would you consider changing the title of the post to something along the lines of "How do Dask worker resources work"? – scj13 Mar 03 '22 at 19:48

2 Answers


adapt() does not guarantee that the cluster will spawn workers to match the number of tasks, nor that tasks will be assigned to the newly spawned worker.

You can lower the target_duration (default "5s") used to calculate the desired number of workers based on the current workload:

cluster.adapt(minimum=0, maximum=10, target_duration="1ns")

I found that "1ns" to "500ms" works for tasks that take at least 2 seconds.
One factor that may affect this is how long the cluster takes to spawn a worker.

But it is not guaranteed that tasks will be assigned to the newly spawned worker.
To achieve round-robin for client.submit(), see my original answer below.


(Original answer #2, before the question was edited to include this)

Pass resources={'process': 1} like the example in the documentation:

futures = [client.submit(f, resources={'process': 1}) for _ in range(5)]

(Original answer for reference)

The built-in scheduler does not guarantee round-robin for client.submit().
However, you can specify a different worker each time via the workers argument.

Scale the cluster if needed:

n = 5
if len(client.cluster.workers) < n:
    client.cluster.scale(n)
    client.wait_for_workers(n)

Call client.scatter() on dummy data (it is documented to be round-robin as long as you don't pass broadcast=True), get the workers that hold that data, and then target those workers:

workers = client.who_has(client.scatter([None]*n, hash=False)).values()
futures = [client.submit(f, pure=False, workers=worker) for worker in workers]
  • @Andrex Have you tried my original answer? – aaron Mar 04 '22 at 14:03
  • Using `client.wait_for_workers(n)` and then getting workers from `worker_hosts = list(client.scheduler_info()["workers"].keys())` could, I think, be less tricky than scattering dummy data (sketched below, after these comments). The problem I have is not with the `scale` method; `adapt` is the problem – Andrex Mar 04 '22 at 14:05
  • @Andrex Try the newly added solution. How long do your tasks take? – aaron Mar 06 '22 at 14:26
  • @Andrex Have you tried the `target_duration="1ns"` solution? – aaron Mar 22 '22 at 16:47
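A minimal sketch of the alternative Andrex describes in the comment above, assuming the workers are already up and f takes no arguments:

n = 5
client.wait_for_workers(n)
# Take worker addresses straight from the scheduler instead of scattering dummy data.
worker_hosts = list(client.scheduler_info()["workers"].keys())[:n]
futures = [client.submit(f, pure=False, workers=[w]) for w in worker_hosts]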

I think this example in the Dask docs is what you're looking for. It requires two steps: 1) defining resources when you set up your cluster and 2) specifying the constraints per task when you submit tasks. Since you'd like to ensure each task runs in a separate process, you can define the resources with dask-worker scheduler:8786 --nworkers 5 --nthreads 6 --resources "process=1" and then use futures = [client.submit(f, resources={'process': 1}) for _ in range(5)] to specify the constraints per task (as @aaron mentioned). This ensures there will be at most 5 tasks running concurrently and each task will run in a separate process.
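Put together, a minimal sketch of the two steps (the scheduler address is a placeholder and f is the function from the question):

# Step 1: start each worker declaring the abstract resource, e.g.
#   dask-worker scheduler:8786 --nworkers 5 --nthreads 6 --resources "process=1"
# Step 2: request one unit of that resource per task, so the scheduler places
# at most one running f per worker process at a time.
from dask.distributed import Client

client = Client("scheduler:8786")  # placeholder address
futures = [client.submit(f, resources={"process": 1}) for _ in range(5)]
results = client.gather(futures)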

It's also worth noting (as explained here) that these resources are abstract from Dask's perspective: you could have chosen any term, so long as it's consistent across workers and clients.
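For example, a hypothetical "slot" label would work exactly the same way as "process":

# Workers started with: --resources "slot=1"
futures = [client.submit(f, resources={"slot": 1}) for _ in range(5)]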

  • executing `futures = [client.submit(f, resources={'process': 1}) for _ in range(5)]` does not work; it has the same behavior as the one I wrote in the post. I had done it according to the documentation, but I didn't write it in the post. I will edit it – Andrex Mar 04 '22 at 13:59