
Suppose a dask cluster has some CPU devices as well as some GPU devices, and each device runs a single dask-worker. How do I find out whether the device underlying a given dask-worker is a CPU or a GPU?

For example, if a dask-worker is running on a CPU device, I should be able to tell that it is running on a CPU, and if it is running on a GPU device, I should be able to tell that instead. Is there a way to determine the device type programmatically?

TheCodeCache
  • This might help https://stackoverflow.com/questions/49854695/can-we-create-a-dask-cluster-having-multiple-cpu-machines-as-well-as-multiple-gp – MRocklin Apr 19 '18 at 11:01

1 Answer


The answer linked in the comment above is about marking workers with resources beforehand, and then assigning tasks according to the resources they need.

Perhaps you instead want to run your computation heterogeneously, i.e., you don't mind which task runs on a GPU machine and which does not, but when a GPU is available you want to make use of it. This case is actually very simple from dask's point of view.

Suppose you have a function that detects whether a GPU is present, and two functions that can process your data, one for each case.

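# this_machine_has_gpu, gpu_process and cpu_process are placeholders
# that you would supply yourself.
def process_data(d):
    # The check runs on whichever worker executes the task.
    if this_machine_has_gpu():
        return gpu_process(d)
    else:
        return cpu_process(d)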
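As a rough illustration of how the pieces above could be filled in, here is a minimal sketch. It assumes NVIDIA hardware and uses the presence of a working `nvidia-smi` executable as the probe, which is only one possible heuristic. The bodies of `cpu_process` and `gpu_process` are hypothetical stand-ins that just tag the result so the chosen path is visible.

```python
import shutil
import subprocess


def this_machine_has_gpu() -> bool:
    """Best-effort probe (assumption): a working nvidia-smi means a GPU is present."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # no NVIDIA tooling on PATH, treat the machine as CPU-only
    try:
        # `nvidia-smi -L` lists visible GPUs, one "GPU <n>: ..." line per device
        out = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    if out.returncode != 0:
        return False
    return any(line.startswith("GPU ") for line in out.stdout.splitlines())


def cpu_process(d):
    # Hypothetical CPU path: plain Python sum
    return ("cpu", sum(d))


def gpu_process(d):
    # Hypothetical GPU path: a real version might hand the data to a GPU library
    return ("gpu", sum(d))


def process_data(d):
    # Dispatch on the hardware of the machine that actually runs this call
    if this_machine_has_gpu():
        return gpu_process(d)
    return cpu_process(d)
```

Calling `process_data([1, 2, 3])` returns a tuple whose first element shows which branch was taken on the machine where it ran.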

A function structured like this works fine as a dask task, whether through the delayed mechanism or with arrays/dataframes, because the check is evaluated on whichever worker ends up executing the task.
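If what you want is simply a report of which workers sit on GPU machines, one option is `Client.run` from `dask.distributed`, which executes a function once on every worker and returns a dict keyed by worker address. The sketch below is only an assumption-laden example: `device_type` and `device_type_per_worker` are hypothetical helper names, and the probe (is `nvidia-smi` on the PATH?) is a crude stand-in for whatever detection suits your hardware.

```python
import shutil


def device_type() -> str:
    """Runs on a worker. Crude probe (assumption): nvidia-smi on PATH means GPU."""
    return "GPU" if shutil.which("nvidia-smi") else "CPU"


def device_type_per_worker(client) -> dict:
    """Map each worker address to 'GPU' or 'CPU'.

    `client` is expected to be a dask.distributed.Client. Client.run calls
    the function on every worker and returns {worker_address: result}.
    """
    return client.run(device_type)
```

With a running cluster you would create the client yourself, e.g. `client = Client(scheduler_address)` after `from dask.distributed import Client`, and then inspect `device_type_per_worker(client)` to see the device type reported by each worker.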

mdurant