
I'm trying to run a large number of identical short tasks with ProcessPoolExecutor on a 256-core Ubuntu Linux machine. The tasks are independent and don't share any resources.

I initialize the ProcessPoolExecutor with 252 workers and submit a large number of tasks to it (100K to 1 million).

What I get is that the tasks start running, but after roughly 1000 tasks the main process gets stuck and no further tasks are executed.

This reproduces on different machines (with the same number of cores).

A sample program:

from concurrent.futures import ProcessPoolExecutor
from time import sleep

import multiprocessing as mp


def run_function(id):
    # Burn a bit of CPU so each task is short but not trivial.
    for i in range(int(1e5)):
        i += 1
    print(f"{id}  Done!")


if __name__ == "__main__":
    proc_count = 252

    mp_ctx = mp.get_context('spawn')

    executor = ProcessPoolExecutor(max_workers=proc_count, mp_context=mp_ctx)
    futures = []

    for i in range(int(1e6)):
        task_future = executor.submit(run_function, i)
        futures.append(task_future)

    # Poll until every submitted future has completed.
    while not all(f.done() for f in futures):
        sleep(2)

Is this a bug in ProcessPoolExecutor? The documentation doesn't mention any such limitation.

I have a workaround, my own process-pool implementation built on raw Python multiprocessing primitives (sketched below), but I would prefer to understand ProcessPoolExecutor and use it properly.
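
Roughly, the workaround looks like this (a simplified sketch, not my exact code): a fixed set of worker processes pull task IDs from a shared multiprocessing.Queue, and one None sentinel per worker signals shutdown, so the main process never has to track a Future per task.

import multiprocessing as mp


def run_function(task_id):
    # Same toy task as in the sample program above.
    for i in range(int(1e5)):
        i += 1
    print(f"{task_id}  Done!")


def worker(task_queue):
    # Each worker loops until it receives the None sentinel.
    while True:
        task_id = task_queue.get()
        if task_id is None:
            break
        run_function(task_id)


if __name__ == "__main__":
    proc_count = 252
    ctx = mp.get_context("spawn")
    # Bound the queue so the main process doesn't enqueue all 1M items at once.
    task_queue = ctx.Queue(maxsize=proc_count * 4)

    workers = [ctx.Process(target=worker, args=(task_queue,))
               for _ in range(proc_count)]
    for w in workers:
        w.start()

    for i in range(int(1e6)):
        task_queue.put(i)

    # One sentinel per worker tells it to exit.
    for _ in workers:
        task_queue.put(None)
    for w in workers:
        w.join()

This runs to completion for me, but it gives up the higher-level Future API, which is why I'd rather make ProcessPoolExecutor work.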
