
I have a problem with multiprocessing in Python. I need to create asynchronous processes which run for an undefined time, and the number of processes is also not known in advance. As soon as a new request arrives, a new process must be created with the specifications from the request. We use ZeroMQ for messaging. There is also one process which is started at the beginning and only ends when the whole script terminates.

Now I am searching for a way to await all processes while still being able to add additional processes.

asyncio.gather()

was my first idea, but it requires the complete list of awaitables before it is called.
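For illustration, a minimal sketch of that limitation (the `work` coroutine is just a hypothetical stand-in for the per-request work):

import asyncio

async def work(n):
    await asyncio.sleep(n / 10)  # stands in for the real per-request work

async def main():
    # gather() fixes its set of awaitables at call time; a request that
    # arrives after this line has no way to join the same gather()
    await asyncio.gather(work(1), work(2))

asyncio.run(main())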

import zmq
import aiomultiprocess

class Object:
    def __init__(self, var):
        self.var = var

    async def run(self):
        ...  # do async things

class object_controller:

    def __init__(self):
        self.ctx = zmq.Context()
        self.socket = self.ctx.socket(zmq.PULL)
        self.socket.connect("tcp://127.0.0.1:5558")

        self.static_process = AStaticProcess()
        self.sp = aiomultiprocess.Process(target=self.static_process.run)
        self.sp.start()
        # here I need a good way to await this process

    def process(self, var):
        obj = Object(var)
        process = aiomultiprocess.Process(target=obj.run)
        process.start()

    def listener(self):
        while True:
            msg = self.socket.recv_pyobj()
            # here I need to find a way to start and await this process while
            # still being able to receive additional requests, which result in
            # additional processes that also need to be awaited

This is some code which hopefully explains my problem. I need some kind of collector that awaits the processes.

After initialization there is no interaction between the objects and the controller; communication happens only over ZeroMQ (between the static process and the dynamically created processes). There is also no return value.

2 Answers


If you need to start up processes while concurrently waiting for new ones, then instead of explicitly calling await to know when each Process finishes, let them execute in the background using asyncio.create_task(). This returns a Task object, which has an add_done_callback method that you can use to do some work when the process completes:

import asyncio
import zmq
import aiomultiprocess

class Object:
    def __init__(self, var):
        self.var = var

    async def run(self):
        ...  # do async things

class object_controller:

    def __init__(self):
        self.ctx = zmq.Context()
        self.socket = self.ctx.socket(zmq.PULL)
        self.socket.connect("tcp://127.0.0.1:5558")

        self.static_process = AStaticProcess()
        self.sp = aiomultiprocess.Process(target=self.static_process.run)
        self.sp.start()
        # run join() in the background and get notified via callback
        t = asyncio.create_task(self.sp.join())
        t.add_done_callback(self.handle_proc_finished)

    def process(self, var):
        obj = Object(var)
        process = aiomultiprocess.Process(target=obj.run)
        process.start()

    def listener(self):
        while True:
            msg = self.socket.recv_pyobj()
            process = aiomultiprocess.Process(...)
            process.start()
            t = asyncio.create_task(process.join())
            t.add_done_callback(self.handle_other_proc_finished)

    def handle_proc_finished(self, task):
        pass  # do something

    def handle_other_proc_finished(self, task):
        pass  # do something else

If you want to avoid using callbacks, you can also pass create_task a coroutine you define yourself, which waits for the process to finish and does whatever needs to be done afterward.

async def wait_for_proc(proc):
    await proc.join()
    # do other stuff

self.sp.start()
asyncio.create_task(wait_for_proc(self.sp))
– dano
  • First, thank you for your reply! This sounds just like the solution I'm searching for, but unfortunately I get an error ```RuntimeError: no running event loop sys:1: RuntimeWarning: coroutine 'Process.join' was never awaited``` I'm not sure what I missed. Any help is very welcome! – N. Icenstein Jul 13 '20 at 11:14
  • @N.Icenstein It's hard to say without a complete reproducer, but it's probably because you're creating a Task before you start up the event loop. You may need to refactor slightly so that the event loop starts up before you call `create_task` anywhere (maybe the `object_controller` constructor, in particular?) – dano Jul 13 '20 at 12:38
  • I'm now creating an event loop in the constructor, and I am now using `asyncio.ensure_future`, which works well so far (I'm currently working on some tests to make sure everything is really correct) – N. Icenstein Jul 13 '20 at 13:20
  • @N.Icenstein I would recommend starting up the event loop before you call the constructor - it just seems cleaner from a design perspective - but I'm glad that it solved the problem. – dano Jul 13 '20 at 13:25
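A self-contained sketch of the pattern these comments converge on, with a hypothetical `Controller` standing in for `object_controller`: start the event loop first via `asyncio.run()`, so that any `create_task()` call made in the constructor finds a running loop.

import asyncio

class Controller:
    """Hypothetical stand-in for object_controller above."""

    def __init__(self):
        # create_task() requires a running event loop, which is why the
        # constructor is only called from inside main() below
        self.task = asyncio.create_task(self.background())

    async def background(self):
        await asyncio.sleep(0.1)  # stands in for awaiting sp.join()

async def main():
    controller = Controller()
    await controller.task  # keep the loop alive until the task finishes

asyncio.run(main())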

You need to create a list of tasks or a Future object for the processes. Also, you cannot add processes to the event loop while awaiting other tasks.

  • So theoretically I have to create a separate event loop for each new task? Is this even possible and practical? Would it be theoretically possible to call a method at the end of the while loop that checks all future objects and then continues? – N. Icenstein Jul 10 '20 at 13:48
  • That would defeat the purpose of async. Python is not actually multithreading. To achieve asynchronicity you must know beforehand which tasks you want to run asynchronously, gather those into a list, and await asyncio.wait(task_list). You could, inside the while loop, create the task list and await it, but in the end you don't end up fully async. You could try to use asyncio queues. See: https://stackoverflow.com/questions/28115253/dynamically-add-to-list-of-what-python-asyncios-event-loop-should-execute – Ionut Dinca Jul 10 '20 at 14:44
  • *you must know beforehand which tasks you want to run asynchronously* - this is not true; you can add new tasks at any point during execution. `gather` and `wait` are just convenience functions for awaiting a number of things to complete; one can easily (and often does) wait for an abstract end event and spawn tasks as necessary, as the sketch below illustrates. – user4815162342 Jul 12 '20 at 06:54
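A minimal sketch of that pattern (all names are hypothetical): the main coroutine waits on an end event while new tasks are spawned at arbitrary points during execution.

import asyncio

async def handle_request(n):
    await asyncio.sleep(0.1)  # stands in for the real per-request work

async def main():
    stop = asyncio.Event()
    tasks = set()

    async def feeder():
        # simulates requests arriving over time; each one spawns a new
        # task while main() is already suspended in stop.wait() below
        for n in range(3):
            tasks.add(asyncio.create_task(handle_request(n)))
            await asyncio.sleep(0.05)
        await asyncio.gather(*tasks)  # drain whatever was spawned
        stop.set()                    # the "abstract end event"

    feeder_task = asyncio.create_task(feeder())
    await stop.wait()  # main() keeps waiting while tasks are added freely
    await feeder_task

asyncio.run(main())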