
This post discusses that tasks which have already started cannot be canceled in Dask (a language limitation).

But what if I just want to omit those tasks?

import time

start_computing_time = time.time()

for future in task_pool:
    if condition:
        do_something_long(future.result())
    else:
        future.cancel()  # only stops futures that have not started running

total_computing_time = time.time() - start_computing_time

In my application, execution time is critical. Once the stopping condition is met, I just want to skip the tasks that are still running, as I am no longer interested in their results. To my knowledge, future.cancel() only cancels futures that have not started running yet.

But for those tasks in execution, is there any way to ignore them?

Thank you in advance!

benhid

1 Answer


It sounds like the as_completed iterator might solve your problem. You can wait on a set of futures and update your system as they arrive. Once you have enough information (or a timeout has passed), you can just move on and delete the running futures.
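As a rough sketch of that pattern, the snippet below collects results in completion order and stops as soon as a stopping condition is met. It uses the standard library's concurrent.futures as a stand-in so that it runs without a cluster; dask.distributed provides its own as_completed and Future.cancel(), which may behave differently in detail. The names slow_square and collect_until and the "two results are enough" condition are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(x):
    """Hypothetical task: sleeps longer for larger inputs, then squares."""
    time.sleep(0.05 * x)
    return x * x


def collect_until(futures, enough):
    """Gather results in completion order; stop once `enough` says so."""
    results = []
    for future in as_completed(futures):
        results.append(future.result())
        if enough(results):
            break
    # cancel() only takes effect on futures that have not started yet;
    # ones already running keep going, but we no longer wait for them.
    for future in futures:
        future.cancel()
    return results


pool = ThreadPoolExecutor(max_workers=2)
futures = [pool.submit(slow_square, x) for x in range(1, 7)]
results = collect_until(futures, lambda r: len(r) >= 2)
pool.shutdown(wait=False)  # do not block on work that is still running
print(results)
```

The caller moves on right after the loop breaks, so the time spent waiting is bounded by the stopping condition rather than by the slowest task.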

MRocklin
  • I forgot to mention that I am using the `as_completed` iterator (`task_pool = as_completed(futures)`), but as I understand it, running futures cannot be canceled (and by the time the line `future.cancel()` is executed, the future has already been computed) – benhid May 27 '19 at 07:46
  • Correct. You can't stop running threads in Python. However, not all futures are actively running; some may be waiting for an open slot or waiting on dependencies. – MRocklin May 28 '19 at 13:29
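
The distinction in the last comment can be seen in a small experiment. This is only a sketch using the standard library's ThreadPoolExecutor with a single worker as an analogue, not Dask itself; blocking_task and the two events are hypothetical helpers that hold one task in the running state while a second one waits for a slot.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

started = threading.Event()
release = threading.Event()


def blocking_task():
    """Hypothetical task that signals it has started, then waits."""
    started.set()
    release.wait(timeout=5)
    return "finished"


with ThreadPoolExecutor(max_workers=1) as pool:
    running = pool.submit(blocking_task)
    queued = pool.submit(blocking_task)  # waits for the single slot
    started.wait(timeout=5)  # make sure the first task is executing

    cancelled_running = running.cancel()  # False: already executing
    cancelled_queued = queued.cancel()    # True: never started
    release.set()  # let the running task finish

print(cancelled_running, cancelled_queued, running.result())
# → False True finished
```

The running task cannot be stopped and still produces its result, while the queued one is dropped before it ever starts.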