I'm trying to run a large number of identical short tasks with ProcessPoolExecutor on a 256-core Ubuntu Linux machine. The tasks are independent and don't share any resources.
I initialize the ProcessPoolExecutor with 252 workers and submit a large number of tasks (100K to 1 million).
The tasks start running, but after ~1000 of them complete, the main process gets stuck and no more tasks are executed.
This reproduces on different machines (with the same number of cores).
A sample program:
from concurrent.futures import ProcessPoolExecutor
from time import sleep
import multiprocessing as mp

def run_function(id):
    # Busy loop to simulate a short CPU-bound task
    for i in range(int(1e5)):
        i += 1
    print(f"{id} Done!")

if __name__ == "__main__":
    proc_count = 252
    mp_ctx = mp.get_context('spawn')
    executor = ProcessPoolExecutor(max_workers=proc_count, mp_context=mp_ctx)
    futures = []
    for i in range(int(1e6)):
        task_future = executor.submit(run_function, i)
        futures.append(task_future)
    # Poll until every future reports done
    while not all(f.done() for f in futures):
        sleep(2)
Is this a bug in ProcessPoolExecutor? The documentation doesn't mention any such limitation.
I have a workaround based on my own process-pool implementation built from raw Python multiprocessing objects (sketched below), but I'd prefer to understand ProcessPoolExecutor and use it properly.
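For reference, the workaround is roughly along these lines (a simplified sketch, not the exact code; the worker count and sentinel-based shutdown are illustrative):

import multiprocessing as mp

def run_function(id):
    # Same busy-loop task as in the repro above
    for i in range(int(1e5)):
        i += 1
    print(f"{id} Done!")

def worker(task_queue):
    # Each worker pulls task ids until it receives the None sentinel
    while True:
        task_id = task_queue.get()
        if task_id is None:
            break
        run_function(task_id)

if __name__ == "__main__":
    mp.set_start_method('spawn')
    proc_count = 252
    task_queue = mp.Queue()
    workers = [mp.Process(target=worker, args=(task_queue,))
               for _ in range(proc_count)]
    for w in workers:
        w.start()
    for i in range(int(1e6)):
        task_queue.put(i)
    for _ in range(proc_count):
        task_queue.put(None)  # one sentinel per worker
    for w in workers:
        w.join()

This version feeds all tasks through a single queue and runs reliably, which is why I suspect the issue is specific to how ProcessPoolExecutor dispatches work rather than to the tasks themselves.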