Here is a multiprocessing workflow with many consumers that execute tasks from a manager.Queue. I use one subclass of multiprocessing.Process, Consumer, which runs Task (where most of the runtime lives), and another, Aggor, which runs the whole time and whose job is to aggregate the output from each of the Task calls. The problem is that the managed Queue (which I am using because multiprocessing.Queue() seemed to lose results frequently) appears to hang: the result queue shows as empty until the first Consumer exits, and then suddenly all of that consumer's results get dumped onto the queue at once. This is not the behavior I want, because x and agg are very large, so holding many of them in memory in parallel is intractable. Ideally Aggor would aggregate results into the final output as soon as they are added to the queue.

I would like to better understand why the queue appears to block get() until each Consumer process ends, and how to work around this to achieve the desired smaller memory profile.
import multiprocessing as mp
import time
import numpy as np

class Consumer(mp.Process):
    def __init__(self, task_queue, result_queue, x, y):
        mp.Process.__init__(self)
        self.task_queue = task_queue
        self.result_queue = result_queue
        self.x = x
        self.y = y

    def run(self):
        while True:
            next_task = self.task_queue.get()
            if next_task is None:
                # poison pill: no more work for this consumer
                self.task_queue.task_done()
                break
            (answer, ind) = next_task(self.x, self.y)
            self.task_queue.task_done()
            self.result_queue.put(answer)
        return
and
class Aggor(mp.Process):
    def __init__(self, result_queue, final_queue, agg):
        mp.Process.__init__(self)
        self.result_queue = result_queue
        self.final_queue = final_queue
        self.agg = agg

    def run(self):
        while True:
            if not self.result_queue.empty():
                answer = self.result_queue.get()
                if answer is None:
                    # poison pill: all results are in
                    break
                self.agg = welford(self.agg, answer)
            else:
                time.sleep(1)  # nothing yet; poll again in a second
        self.final_queue.put(self.agg)
        return
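One variation of Aggor.run() that avoids the empty()/sleep() poll entirely is to block on get(); empty() is only a momentary snapshot on a shared queue, so blocking is the shape I would expect to be more reliable. A minimal sketch of just run(), with everything else unchanged:

    def run(self):
        # block on get() so each result is consumed the moment it is available
        while True:
            answer = self.result_queue.get()  # blocks until an item arrives
            if answer is None:
                # poison pill: all results are in
                break
            self.agg = welford(self.agg, answer)
        self.final_queue.put(self.agg)
        return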
and a task:
class Task(object):
    def __init__(self, rf, ind):
        self.rf = rf
        self.ind = ind

    def __call__(self, x, y):
        self.rf.fit(x, y)
        m = Importance(self.rf, x)  # the very time-consuming, not-multithreaded step
        return (m, self.ind)

    def __str__(self):
        return f"job {self.ind}"
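rf, Importance, and welford are my own pieces; for anyone who wants to run this end to end, dummy stand-ins like the following (placeholders, not my real code) are enough to exercise the queue behavior:

class DummyRF(object):
    """Placeholder for the real model; fit() just burns a little time."""
    def fit(self, x, y):
        time.sleep(0.1)

def Importance(rf, x):
    # placeholder for the slow, single-threaded importance computation
    return np.ones(x.shape)

def welford(agg, answer):
    # placeholder running aggregation: elementwise running sums
    return (agg[0] + answer, agg[1] + answer ** 2)

x = np.zeros((100, 10))
y = np.zeros(100)
rf = DummyRF()
num_consumers = 4
n_times = 20

The driver is then: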
manager = mp.Manager()
tasks = mp.JoinableQueue()
results = manager.Queue()
final = manager.Queue()

agg = (np.zeros(x.shape), np.zeros(x.shape))
ag = Aggor(results, final, agg)
ag.start()

consumers = [Consumer(tasks, results, x, y) for i in range(num_consumers)]
for w in consumers:
    w.start()

for ii in range(n_times):
    tasks.put(Task(rf, ii))
for i in range(num_consumers):
    tasks.put(None)  # one poison pill per consumer

tasks.join()
# this code hangs here for a long time and Aggor does not run until the first Consumer exits
results.put(None)  # tell Aggor it can finish
ag.join()
final_result = final.get()
I was expecting that the result queue would receive output as quickly as the Consumer processes produced it, and that the Aggor process would run largely in parallel with the consumers. Instead, the Consumer processes run all the way to completion and exit before the Aggor process is able to get any results from the result queue.
Other variations I have tried include multiprocessing.Queue instead of manager.Queue, and standalone functions launched with mp.Process instead of Process subclasses. Pool and map would be ideal, except that they want to keep a monolithic stack of my giant output matrices, and there is no reduce operation into which I could pass my aggregator as a lambda. I am open to small tweaks, and also to someone telling me these are simply the wrong tools for the job and that I should commit to a different, more functional approach.
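To make that last point concrete, the shape I would want from Pool is something like the sketch below: imap_unordered hands results back one at a time as workers finish them, so the fold can happen in the parent without ever holding the monolithic stack. It reuses the placeholder Importance/welford names from above plus a hypothetical run_task wrapper; it is a sketch of the pattern, not a drop-in for my real code:

from functools import partial

def run_task(ind, rf, x, y):
    # same work as Task.__call__, as a plain function so Pool can map it
    rf.fit(x, y)
    return Importance(rf, x)

if __name__ == "__main__":
    agg = (np.zeros(x.shape), np.zeros(x.shape))
    with mp.Pool(num_consumers) as pool:
        # imap_unordered yields each result as soon as some worker finishes it,
        # so only a few outputs are ever alive at once
        for answer in pool.imap_unordered(
                partial(run_task, rf=rf, x=x, y=y), range(n_times)):
            agg = welford(agg, answer)  # fold immediately, then drop the result
    final_result = agg

If that pattern is actually the right answer here, I am happy to restructure around it.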