My code receives event notifications, and I process each event as soon as it arrives. It used to be single-threaded, but the events arrived faster than it could handle them, so I changed the code to use multiprocessing with a Pool. Here's what I did:

  • created a pool: pool = multiprocessing.Pool(processes=4) (I can use at most 11 cores)
  • added events to the pool asynchronously: pool.apply_async(go, ["event-1"])

That's all I did. Put simply, I add events to the pool and the pool's 4 processes handle them. Now my question is:

  • How can I verify that my events are being processed by all 4 processes? I start my scheduler every Sunday. Monday is fine, but on Tuesday I still see Monday's events being processed, on Wednesday many of Tuesday's events are still being processed, and the backlog keeps growing.

I'm basically a Java guy, and I'm finding it difficult to understand how Python processes my events internally. I could simply increase the number of processes, but I'm not sure whether that would help.

My basic requirements are:

  • I register for events and want to process every event.
  • I want to process each event in a separate process so that the main process/thread can keep listening for new events.
  • I don't care about the result of a processed event (although pool.apply_async(func1, ["event1"]) does return a value).

Could you please share some thoughts?

Janne Karila

1 Answer

Pool.apply_async places the event in the pool's task queue, and the first free worker process to pick it up executes go(event).
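
Since you don't need the results, here is a minimal sketch of the fire-and-forget pattern. It assumes Python 3; the body of go and the on_error helper are placeholders I made up for illustration, not part of your code. The object returned by apply_async can simply be dropped, but passing an error_callback keeps exceptions raised in a worker from disappearing silently.

```python
import multiprocessing
import os


def go(event):
    """Hypothetical handler standing in for the real event processing."""
    if event is None:
        raise ValueError("empty event")
    return os.getpid(), event


def on_error(exc):
    # Called in the main process when go() raised in a worker.
    print(f"event failed: {exc!r}")


if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        for event in ["event-1", "event-2", None]:
            # Returns immediately; the AsyncResult is deliberately discarded.
            pool.apply_async(go, [event], error_callback=on_error)
        pool.close()  # no more events will be submitted
        pool.join()   # wait until every queued event has been handled
```

In a long-running listener you would keep the pool open and only call close() and join() on shutdown.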

A simple way to figure out which process is doing what is to add some logging to your go function.

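import logging
import os

# The root logger defaults to WARNING, so enable INFO to see these messages.
logging.basicConfig(level=logging.INFO)

def go(event):
    # os.getpid() must be called; it returns the ID of the worker process.
    logging.info("process: %d, event: %r", os.getpid(), event)
    # do actual processing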
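
If you would rather get a summary than read through log lines, a rough sketch along the same lines is to have go return its process ID and count events per worker. The events_per_worker helper, the sleep, and the 20 dummy events are assumptions for the demo only.

```python
import collections
import multiprocessing
import os
import time


def go(event):
    time.sleep(0.05)  # placeholder for the real work
    return os.getpid(), event


def events_per_worker(results):
    """Count how many events each worker process ID handled."""
    return dict(collections.Counter(pid for pid, _ in results))


if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        pending = [pool.apply_async(go, [f"event-{i}"]) for i in range(20)]
        results = [p.get() for p in pending]  # blocks until each event is done
    for pid, count in sorted(events_per_worker(results).items()):
        print(f"worker {pid} handled {count} events")
```

If the pool is being used fully you should see several distinct process IDs, though the split between them does not have to be even.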

How many processes you want in your pool depends on what sort of workload you have. If your jobs are CPU-heavy, a worker pool larger than the number of cores won't help much. However, if your bottleneck is I/O, you can probably benefit from more workers, and you should consider switching to threads (see multiprocessing.pool.ThreadPool).
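
As a sketch of the thread-based variant, ThreadPool exposes the same interface as Pool, so the change is small. The sleep stands in for a blocking network or disk call, and the run_io_bound helper and the worker count of 4 per core are arbitrary assumptions to tune against your own workload.

```python
import os
import threading
import time
from multiprocessing.pool import ThreadPool


def go(event):
    time.sleep(0.1)  # stands in for a blocking I/O call
    return threading.current_thread().name, event


def run_io_bound(events, workers):
    """Process events on a pool of threads; results keep the input order."""
    with ThreadPool(processes=workers) as pool:
        return pool.map(go, events)


if __name__ == "__main__":
    # Assumed starting point for I/O-bound work: more workers than cores.
    workers = 4 * (os.cpu_count() or 1)
    for thread_name, event in run_io_bound([f"event-{i}" for i in range(8)], workers):
        print(f"{thread_name} handled {event}")
```

For CPU-heavy jobs, keep multiprocessing.Pool and size it at or below os.cpu_count() instead.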

radu.ciorba