I have a use case where I have to process documents, which takes some time. So I batched the documents and processed the batches with multiprocessing; it worked well and completed in less time, as expected. There are also multiple stages of processing the docs, and I used multiprocessing at each stage individually. However, when I fire multiple concurrent requests to do the processing, after serving some 70+ requests I noticed that some of the processes are never killed.
I am performing a load test with Locust, where I create 5 users with a wait time of 4-5 seconds, and each request takes approximately 3.5 seconds. I tried the multiprocessing package and various other wrappers (pebble, parallel-execute, pathos, concurrent.futures).
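For context, this is roughly the Locust setup I use (the endpoint path and the payload are placeholders for my actual service):

from locust import HttpUser, task, between

class ProcessingUser(HttpUser):
    # 5 of these users are spawned; each waits 4-5 s between requests
    wait_time = between(4, 5)

    @task
    def process_documents(self):
        # "/process" and the payload stand in for my real endpoint
        self.client.post("/process", json={"docs": ["doc1", "doc2"]})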
What I basically do is:

from multiprocessing import Pool

with Pool(processes=5) as p:
    out = p.starmap(do_something, args)
    p.close()
    p.terminate()
Also, the official documentation says that the pool will be closed automatically after execution when it is used in a with statement like this. When I stop firing requests, the last one or two requests are left stagnant. I found this simply by printing "Started {req_num}" and "Served {req_num}" before and after the processing. Before adding p.close() and p.terminate(), I could see even more processes still running after I stopped triggering requests. After adding them, only the last triggered request is not served. And if I start triggering requests again and stop after a while, the same last one or two requests are again not served and their processes stay stagnant, so the stagnant processes accumulate.
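To make the symptom concrete, here is a minimal sketch of how each request is handled and logged; do_something, args, and handle_request are simplified stand-ins for my real processing stages:

from multiprocessing import Pool

def do_something(doc, stage):
    # placeholder for one stage of my actual document processing
    return f"{doc}-{stage}"

def handle_request(req_num, args):
    print(f"Started {req_num}")
    with Pool(processes=5) as p:
        out = p.starmap(do_something, args)
    print(f"Served {req_num}")  # never prints for the stagnant requests
    return out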
Every wrapper I mentioned has a different way of closing the pool, and I tried those too. For example, with pathos:
from pathos.pools import ProcessPool as Pool

p = Pool(nodes=5)  # pathos pools take nodes= rather than processes=
out = p.map(do_something, args)
p.close()          # close() has to come before join()
p.join()
p.terminate()
And with concurrent.futures.ThreadPoolExecutor it was p.shutdown(). With every other wrapper I faced the same issue; in fact, the number of stagnant processes was higher than with multiprocessing.Pool.
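This is roughly what the concurrent.futures attempt looked like (again with do_something and the inputs standing in for my real processing):

from concurrent.futures import ThreadPoolExecutor

def do_something(doc, stage):
    # placeholder for one stage of my document processing
    return f"{doc}-{stage}"

docs = ["doc1", "doc2", "doc3"]
stages = [1, 1, 1]

p = ThreadPoolExecutor(max_workers=5)
out = list(p.map(do_something, docs, stages))
p.shutdown(wait=True)  # shutdown() replaces close()/terminate() here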
I need help finding the reason for this, or the right way to do it. Any help would be much appreciated!