
My routine below takes a list of urllib2.Requests, spawns a new process per request, and fires them off. The purpose is asynchronous speed, so it's all fire-and-forget (no response needed). The issue is that the processes spawned by the code below never terminate, so after a few of these the box will OOM. Context: Django web app. Any help?

import logging
import multiprocessing
import traceback
import urllib2

MP_CONCURRENT = int(multiprocessing.cpu_count()) * 2
if MP_CONCURRENT < 2: MP_CONCURRENT = 2
MPQ = multiprocessing.JoinableQueue(MP_CONCURRENT)


def request_manager(req_list):
    try:
        # put the request list in the queue
        for req in req_list:
            MPQ.put(req)

            # spawn a worker process for the queue
            worker = multiprocessing.Process(target=process_request, args=(MPQ,))
            worker.daemon = True
            worker.start()

        # move on after the queue is empty
        MPQ.join()

    except Exception, e:
        logging.error(traceback.format_exc())


# process requests in the queue
def process_request(MPQ):
    try:
        while True:
            req = MPQ.get()
            dr = urllib2.urlopen(req)
            MPQ.task_done()

    except Exception, e:
        logging.error(traceback.format_exc())
  • while True: - is there any termination? – eri Sep 05 '13 at 18:51
  • I've tried a few different approaches to this, including terminate(), rescoping the global variables, sleeping and terminating, join() and no join(). The only thing that sort of worked thus far was doing time.sleep(1) and then worker.terminate(), but that interrupted the process, and sleeping through thousands of potential requests wouldn't work :( – Kruunch Arz Sep 05 '13 at 18:59
  • And in answer to your question: while True is satisfied when the Queue empties out (I know, it's not very intuitive). – Kruunch Arz Sep 05 '13 at 19:00
  • `except Queue.Empty,e: logging.info('task done'); except Exception, e: logging.error(traceback.print_exc())` (see the sketch after these comments) – eri Sep 05 '13 at 19:07
  • Your solution is not good; I suggest using a big shared Pool and map_async instead of a queue. – eri Sep 05 '13 at 19:09
  • I was already thinking of using Pool, but now I'm a dog with a bone. I *need* to know why these processes orphan. Also, isn't the mechanism essentially the same for this particular example, whether I'm using Pool or Queue? – Kruunch Arz Sep 05 '13 at 19:11
  • Example in the answers; the pool works 5x faster in my code, where I needed to "fire and forget". – eri Sep 05 '13 at 19:19
  • A child process does not terminate when it finishes; it becomes a zombie. Unix works this way because processes are cheap. – eri Sep 05 '13 at 19:21
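
For reference, a minimal sketch of what eri's Queue.Empty comment describes, assuming a timeout on get() (a plain blocking get() never raises Queue.Empty); the 5-second value is an arbitrary placeholder:

# Sketch only: a worker that exits once the queue drains, per the comment above.
import Queue
import logging
import traceback
import urllib2

def process_request(MPQ):
    try:
        while True:
            try:
                # timeout is assumed here; without it, get() blocks forever
                req = MPQ.get(timeout=5)
            except Queue.Empty:
                logging.info('task done')
                break
            urllib2.urlopen(req)
            MPQ.task_done()
    except Exception, e:
        logging.error(traceback.format_exc())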

4 Answers


Maybe I am not right, but

MP_CONCURRENT = int(multiprocessing.cpu_count()) * 2
if MP_CONCURRENT < 2: MP_CONCURRENT = 2
MPQ = multiprocessing.JoinableQueue(MP_CONCURRENT)


def request_manager(req_list):
    try:
        # put the request list in the queue
        pool = []
        for req in req_list:
            MPQ.put(req)

            # spawn a worker process for the queue
            worker = multiprocessing.Process(target=process_request, args=(MPQ,))
            worker.daemon = True
            worker.start()
            pool.append(worker)

        # move on after the queue is empty
        MPQ.join()
        # terminate the worker processes that are no longer needed
        for p in pool: p.terminate()

    except Exception, e:
        logging.error(traceback.format_exc())


# process requests in the queue
def process_request(MPQ):
    try:
        while True:
            req = MPQ.get()
            dr = urllib2.urlopen(req)
            MPQ.task_done()

    except Exception, e:
        logging.error(traceback.format_exc())
eri
  • That eliminated about 20% of the extra processes, but I think that was due to terminate() hitting a process before it had gone inactive in some cases. I got a bunch of "Interrupted system call" errors in the logs with this method, which is also what I saw with my attempted solution. – Kruunch Arz Sep 05 '13 at 19:49
MP_CONCURRENT = int(multiprocessing.cpu_count()) * 2
if MP_CONCURRENT < 2: MP_CONCURRENT = 2
MPQ = multiprocessing.JoinableQueue(MP_CONCURRENT)
CHUNK_SIZE = 20  # number of requests sent to one process
pool = multiprocessing.Pool(MP_CONCURRENT)

def request_manager(req_list):
    try:
        # send the request list to the pool
        response = pool.map(process_request, req_list, CHUNK_SIZE)  # returns after all requests have been made and the pool's work has finished
        # OR
        response = pool.map_async(process_request, req_list, CHUNK_SIZE)  # request_manager returns as soon as the requests are passed to the pool

    except Exception, e:
        logging.error(traceback.format_exc())


# process a single request
def process_request(req):
    dr = urllib2.urlopen(req)
This works ~5-10x faster than your code.

eri
  • Thanks for the quick reply; however, that did not work. It used twice as much memory and almost twice as many processes (267 vs. 150 my way, running the same routine). Processes are still orphaned (complete with the memory leak). – Kruunch Arz Sep 05 '13 at 19:39
  • pool = multiprocessing.Pool(MP_CONCURRENT) must be at module level, not inside a function – eri Sep 05 '13 at 20:54
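
A minimal sketch of the layout that comment describes, reusing the names from the answer above: the worker function is defined before the Pool is created, and the Pool is built once at import time so its processes are reused across calls instead of being respawned inside request_manager.

# Sketch only: module-level Pool, per the comment above.
import multiprocessing
import urllib2

def process_request(req):
    urllib2.urlopen(req)

MP_CONCURRENT = max(int(multiprocessing.cpu_count()) * 2, 2)
CHUNK_SIZE = 20
pool = multiprocessing.Pool(MP_CONCURRENT)  # created once, at module level

def request_manager(req_list):
    # hand the batch to the long-lived workers and return immediately
    pool.map_async(process_request, req_list, CHUNK_SIZE)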

Integrate a message broker (such as RabbitMQ or something like it) into Django.

eri
  • That's what I would normally have run (along with something like Celery) to offload the repetitive tasks, except that in this case the multiple requests are web API hits that circle back around to this web app (so I'd have to deal with the processing anyway). – Kruunch Arz Sep 06 '13 at 13:16
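
For illustration, a sketch of what the broker route could look like, assuming Celery is already configured against RabbitMQ for the Django project; none of this appears in the original code.

# Hypothetical Celery task; assumes a configured Celery app and broker.
from celery import shared_task
import urllib2

@shared_task
def fire_request(url):
    # the Celery worker process, not the web process, performs the fetch
    urllib2.urlopen(url, timeout=30)

# fire-and-forget from the web app:
#   for url in urls:
#       fire_request.delay(url)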

Ok, after some fiddling (and a good night's sleep) I believe I've figured out the problem (and thank you Eri, you were the inspiration I needed). The main issue behind the zombie processes was that I was not signaling back that the process was finished (and killing it), both of which I (naively) thought were happening automagically with multiprocessing.

The code that worked:

import logging
import multiprocessing
import sys
import traceback
import urllib2

# function that will be run through the pool
def process_request(req):
    try:
        dr = urllib2.urlopen(req, timeout=30)

    except Exception, e:
        logging.error(traceback.format_exc())

# process killer
def sig_end(r):
    sys.exit()

# globals
MP_CONCURRENT = int(multiprocessing.cpu_count()) * 2
if MP_CONCURRENT < 2: MP_CONCURRENT = 2
CHUNK_SIZE = 20
POOL = multiprocessing.Pool(MP_CONCURRENT)

# pool initiator
def request_manager(req_list):
    try:
        resp = POOL.map_async(process_request, req_list, CHUNK_SIZE, callback=sig_end)

    except Exception, e:
        logging.error(traceback.format_exc())

A couple of notes:

1) The function that will be hit by "map_async" ("process_request" in this example) must be defined first (and before the global declarations).

2) There is probably a more graceful way to exit the process (suggestions welcome).

3) Using Pool in this example really was best (thanks again, Eri) due to the "callback" feature, which allows me to throw a signal right away.
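
For completeness, a hypothetical call site; the example.com URLs are placeholders for the real API endpoints the requests hit.

# Hypothetical usage of the solution above (placeholder URLs).
import urllib2

req_list = [urllib2.Request('http://example.com/api/ping?id=%d' % i)
            for i in range(1000)]

request_manager(req_list)  # returns right away; POOL's workers send the requests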