
I am using concurrent.futures.ProcessPoolExecutor to run Python code in parallel. Basically, what I do is:

import concurrent.futures

with concurrent.futures.ProcessPoolExecutor(max_workers=10) as executor:
    futures = {executor.submit(my_function, i)
               for i in range(n)}
    
    for fut in concurrent.futures.as_completed(futures):
        print(fut.result())

This works fine for a small n, but for larger n it takes up a lot of RAM. I suspected that storing the set (or list) of futures was taking up the RAM, so I tried not to store the futures set and instead implemented whatever I wanted to do with the results in my_function itself, like:

import concurrent.futures

with concurrent.futures.ProcessPoolExecutor(max_workers=10) as executor:
    for i in range(n):
        executor.submit(my_function, i)

But it still takes up a lot of RAM.

With some more digging, I found this. I understand that the first for loop submits all the tasks, but it takes time to execute them, so the tasks that are submitted but not yet executed are kept in RAM.

Intuitively, I understand that one should not submit all the tasks at once, but rather submit them gradually as previous tasks complete. I don't want to add any sleep/delay in the loop. Is there a better way to do that? With the map method instead of submit, I really did not understand what the chunksize argument does and how to decide what value to assign to it.

Is there a better or more elegant way to do it? Or am I completely wrong? I used GNU parallel before, and it doesn't cause such large RAM problems. I want a Python-only solution.

1 Answer


The solution is to use a queue to limit the number of futures that are pending at any one time, or to poll periodically for completed futures and submit work in batches instead of all at once.

This post is closely tied to your problem and may help you solve your issue.

Louis Lac