My code does roughly the following:
    from pathos.multiprocessing import ProcessingPool

    def myFunc(something):
        thispool = ProcessingPool(nodes=Result.cores)
        listOfResults = thispool.map(something)
        return listOfResults

    for i in range(1000):
        myFunc(i)
Now, in my actual, more involved code, memory usage just kept growing. The code should use almost no memory, but if I run it with 12 cores, these 12 processes initially take almost 1 MB of memory each, yet over a runtime of several hours each grows to several GB.
So, I thought that the pool was leaking memory, and that I had better close it after each iteration:
    def myFunc(something):
        thispool = ProcessingPool(nodes=Result.cores)
        listOfResults = thispool.map(something)
        thispool.close()
        thispool.join()
        return listOfResults
However, now, after several iterations, I get

    ValueError: Pool not running

at the thispool.map() line. If I create a new pool with

    test = ProcessingPool(nodes=4)

and try to run test.map(), I get the same error. Which is weird, since I have initialized a new variable... Does pathos.multiprocessing.ProcessingPool keep a single, shared process pool, so that if I close one, I close them all?
What's the correct way of using a pathos.multiprocessing.ProcessingPool inside a loop, without leaking memory? When I use multiprocessing.Pool instead, the problem does not arise.
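For reference, the stdlib variant that works for me looks roughly like this (a minimal sketch: math.sqrt stands in for my real per-item function, and the core count and names are illustrative, not from my actual code):

    import math
    from multiprocessing import Pool

    def run_batch(values, cores=4):
        # A fresh pool per call; close() + join() tear the workers down,
        # so memory is released between iterations.
        pool = Pool(processes=cores)
        try:
            results = pool.map(math.sqrt, values)  # stand-in for the real work
        finally:
            pool.close()  # no more tasks will be submitted
            pool.join()   # wait for worker processes to exit
        return results

    if __name__ == "__main__":
        for i in range(5):
            run_batch(range(1, 10))

With multiprocessing, a brand-new Pool object really is a new pool, so closing one per iteration never triggers the "Pool not running" error.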