Does anyone know what is wrong with this code? It simply "loads" forever with no output. sites is a list of a few dozen strings.

num_worker_threads = 30

def mwRegisterWorker():
    while True:
        try:
            print q.get()
        finally:
            pass

q = multiprocessing.JoinableQueue()
for i in range(num_worker_threads):
    gevent.spawn(mwRegisterWorker)

for site in sites:
    q.put(site)

q.join()  # block until all tasks are done
Alexander Cameron

You might be better off using gevent by itself and running multiple instances of the Python script on different ports. A simple stack would be: nginx (reverse proxy to the ports of the running Python instances) -> gevent WSGI Python (multiple instances of the script running on different ports). Also consider gunicorn. – scape Apr 23 '13 at 16:13

2 Answers

gevent.spawn() creates greenlets, not processes (moreover, all greenlets run in a single OS thread), so multiprocessing.JoinableQueue is not appropriate here: its blocking join() never switches to gevent's event loop, so the spawned greenlets never get a chance to run.

gevent is based on cooperative multitasking, i.e., other greenlets won't run until you call a blocking function that switches to gevent's event loop. For example, conn below uses socket methods patched for gevent, which allow other greenlets to run while waiting for a reply from the site. And without pool.join(), which gives up control to the greenlet that runs the event loop, no connections would be made.

To limit concurrency while making requests to several sites, you could use gevent.pool.Pool:

#!/usr/bin/env python
from gevent.pool import Pool
from gevent import monkey; monkey.patch_socket()
import httplib # now it can be used from multiple greenlets

import logging
info = logging.getLogger().info

def process(site):
    """Make HEAD request to the `site`."""
    conn = httplib.HTTPConnection(site)
    try:
        conn.request("HEAD", "/")
        res = conn.getresponse()
    except IOError, e:
        info("error %s reason: %s" % (site, e))
    else:
        info("%s %s %s" % (site, res.status, res.reason))
    finally:
        conn.close()

def main():
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")

    num_worker_threads = 2
    pool = Pool(num_worker_threads)  # at most this many requests run concurrently
    sites = ["google.com", "bing.com", "duckduckgo.com", "stackoverflow.com"]*3
    for site in sites:
        pool.apply_async(process, args=(site,))
    pool.join()

if __name__ == "__main__":
    main()
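
As an aside, gevent's Pool also provides a blocking map(), so the apply_async loop followed by pool.join() above could be collapsed into a single call (same process and sites as above):

pool.map(process, sites)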
jfs

Use gevent.queue.JoinableQueue instead. Green threads (which gevent uses internally) are neither threads nor processes, but coroutines with user-level scheduling.
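
A minimal sketch of the question's code rewritten this way (the site list here is a placeholder; note that each worker must call task_done(), which the original code never does, or q.join() will never return):

import gevent
from gevent.queue import JoinableQueue

num_worker_threads = 30
q = JoinableQueue()

def mwRegisterWorker():
    while True:
        site = q.get()     # cooperative: yields to other greenlets while waiting
        try:
            print(site)
        finally:
            q.task_done()  # tells q.join() this item is finished

for i in range(num_worker_threads):
    gevent.spawn(mwRegisterWorker)

sites = ["google.com", "bing.com", "duckduckgo.com"]  # placeholder list
for site in sites:
    q.put(site)

q.join()  # blocks until task_done() has been called for every queued item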

minhee