I am new to Dask.

I can submit 10 tasks with client.map(funct_name, iterator), where iterator is a list containing 10 elements.

Now I want to submit the next task (say, the 11th) as soon as any one of the 10 earlier tasks completes.

I know Python has process pooling, but I want to implement something like it using Dask.

Can someone please guide me on process pooling with Dask?

Mahendra Gaur

2 Answers

The easiest thing you can do is to call wait before submitting the new work. By default, wait blocks until all of the futures have finished:

from dask.distributed import wait

futs = client.map(funct_name, iterator)
wait(futs)  # blocks until all ten futures have finished
out = client.submit(eleventh, args)
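Since the question asks about reacting when any one task finishes, here is a minimal sketch of the same idea using the return_when argument of wait. The names square and submit_after_first are hypothetical stand-ins, and the in-process Client is only an assumption to keep the example self-contained.

```python
from dask.distributed import Client, wait


def square(x):
    # Hypothetical stand-in for funct_name
    return x * x


def submit_after_first(client, items, extra):
    """Submit `extra` as soon as any one of the initial futures finishes."""
    futs = client.map(square, items)
    # Returns once at least one future is done (more may be done by then)
    done, not_done = wait(futs, return_when="FIRST_COMPLETED")
    eleventh = client.submit(square, extra)
    # Collect everything: the original futures plus the new one, in order
    return client.gather(futs + [eleventh])


if __name__ == "__main__":
    client = Client(processes=False)  # in-process cluster, for illustration only
    print(submit_after_first(client, list(range(10)), 10))
    client.close()
```

This only handles a single extra task; for a continuous stream of new tasks, a loop over completed futures is probably a better fit.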

If, however, you want to submit the new work while the previous ten are still in flight, but have it wait automatically until all of them are done, you can construct a task that nominally depends on the previous work without actually using its results:

futs = client.map(funct_name, iterator)

def run_eleventh(args, deps):
    # deps is ignored; it only forces Dask to wait for the earlier futures
    return eleventh(args)

out = client.submit(run_eleventh, args, futs)
mdurant

You might want to look at the as_completed object here:

http://docs.dask.org/en/latest/futures.html#waiting-on-futures

from dask.distributed import as_completed

futures = client.map(score, x_values)

for future in as_completed(futures):
    ...
    client.submit(...)
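To make the loop above concrete, here is a minimal sketch of a pool-like pattern that keeps a fixed number of tasks in flight and submits one more each time a task finishes. It relies on as_completed accepting new futures through its add method; the score body, the run_with_window helper, and the window size are hypothetical choices for illustration, not part of the original answer.

```python
from itertools import islice

from dask.distributed import Client, as_completed


def score(x):
    # Hypothetical stand-in for the real task
    return x + 1


def run_with_window(client, values, window=10):
    """Keep at most `window` tasks in flight; refill as each one finishes."""
    pending = iter(values)
    # Start with the first `window` tasks
    first = [client.submit(score, v) for v in islice(pending, window)]
    seq = as_completed(first)
    results = []
    for future in seq:
        results.append(future.result())
        # One task finished, so submit the next one, if any remain
        for v in islice(pending, 1):
            seq.add(client.submit(score, v))
    return results  # completion order, not input order


if __name__ == "__main__":
    client = Client(processes=False)  # in-process cluster, for illustration only
    print(sorted(run_with_window(client, range(25))))
    client.close()
```

Results come back in completion order, so sort them or carry the input alongside the output if the original order matters.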
MRocklin