I have a function that I would like to run several times in parallel, but with only a fixed number of instances running at the same time.
The natural way to do this seems to be to use multiprocessing.Pool. Specifically, the documentation says that
A frequent pattern (...) is to allow a worker within a pool to complete only a set amount of work before being exiting, being cleaned up and a new process spawned to replace the old one. The maxtasksperchild argument to the Pool exposes this ability to the end user.
maxtasksperchild is defined as:
maxtasksperchild is the number of tasks a worker process can complete before it will exit and be replaced with a fresh worker process, to enable unused resources to be freed. The default maxtasksperchild is None, which means worker processes will live as long as the pool.
I am not clear what "task" means here. If I want, say, at most 4 instances of my worker running in parallel, should I create the multiprocessing.Pool as
pool = multiprocessing.Pool(processes=4, maxtasksperchild=4)
How do processes and maxtasksperchild work together? Could I set processes to 10 and still have only 4 workers running (effectively leaving 6 processes idle)?
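To make the question concrete, here is a small experiment I could run. It is only a sketch of my own, not something taken from the documentation: work and run_experiment are hypothetical names, and I am assuming that with chunksize=1 each call to work counts as one task. It records which worker PIDs handled the tasks and how many tasks were running at once, first with maxtasksperchild=None and then with maxtasksperchild=1.

```python
import multiprocessing
import os
import time


def work(task_id):
    """Hypothetical task: sleep briefly, then report who ran it and when."""
    start = time.time()
    time.sleep(0.1)
    return os.getpid(), start, time.time()


def run_experiment(processes, maxtasksperchild, n_tasks=12):
    """Return (set of worker PIDs used, peak number of tasks running at once)."""
    with multiprocessing.Pool(processes=processes,
                              maxtasksperchild=maxtasksperchild) as pool:
        # chunksize=1 so that every call to work() is dispatched separately
        # (assumption: each dispatched chunk is what counts as one "task").
        results = pool.map(work, range(n_tasks), chunksize=1)

    pids = {pid for pid, _, _ in results}

    # Sweep over start/end timestamps to find the largest overlap.
    events = []
    for _, start, end in results:
        events.append((start, 1))
        events.append((end, -1))
    running = peak = 0
    for _, delta in sorted(events):  # at equal times, ends sort before starts
        running += delta
        peak = max(peak, running)
    return pids, peak


if __name__ == "__main__":
    for maxtasks in (None, 1):
        pids, peak = run_experiment(processes=4, maxtasksperchild=maxtasks)
        print(f"maxtasksperchild={maxtasks}: "
              f"{len(pids)} distinct PIDs, peak concurrency {peak}")
```

If processes is what caps concurrency, I would expect the peak to stay at or below 4 in both runs, with maxtasksperchild=1 only making the pool cycle through more PIDs. Is that the right way to read the two arguments?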