`gevent.spawn()` creates greenlets, not processes (moreover, all greenlets run in a single OS thread), so `multiprocessing.JoinableQueue` is not appropriate here.

`gevent` is based on cooperative multitasking, i.e., other greenlets won't run until you call a blocking function that switches to gevent's event loop. For example, `conn` below uses socket methods patched for gevent that allow other greenlets to run while it waits for a reply from the site. And without `pool.join()`, which gives up control to the greenlet that runs the event loop, no connections would be made.
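Here's a minimal sketch of that switching behavior (the greenlet names and step counts are illustrative, not part of the question): a greenlet that never calls anything blocking keeps the thread to itself, while `gevent.sleep(0)` explicitly switches to the event loop so the other greenlets get a turn.

import gevent

def busy(name):
    # no blocking call, no switch: runs to completion before any other greenlet runs
    print("%s finished without yielding" % name)

def polite(name):
    for i in range(3):
        print("%s step %d" % (name, i))
        gevent.sleep(0)  # blocking call: switch to the event loop, let others run

# like pool.join() below, joinall() gives up control until all greenlets finish
gevent.joinall([gevent.spawn(polite, "a"),
                gevent.spawn(polite, "b"),
                gevent.spawn(busy, "c")])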
To limit concurrency while making requests to several sites, you could use `gevent.pool.Pool`:
#!/usr/bin/env python
from gevent.pool import Pool
from gevent import monkey; monkey.patch_socket()

import httplib  # now it can be used from multiple greenlets
import logging

info = logging.getLogger().info

def process(site):
    """Make HEAD request to the `site`."""
    conn = httplib.HTTPConnection(site)
    try:
        conn.request("HEAD", "/")
        res = conn.getresponse()
    except IOError, e:
        info("error %s reason: %s" % (site, e))
    else:
        info("%s %s %s" % (site, res.status, res.reason))
    finally:
        conn.close()

def main():
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")

    num_worker_threads = 2
    pool = Pool(num_worker_threads)  # at most 2 greenlets run process() at a time

    sites = ["google.com", "bing.com", "duckduckgo.com", "stackoverflow.com"]*3
    for site in sites:
        pool.apply_async(process, args=(site,))
    pool.join()

if __name__=="__main__":
    main()
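The same throttling can also be written with `Pool.imap_unordered()`, which blocks the calling greenlet while it iterates over results, so no explicit `pool.join()` is needed. A sketch, assuming the `process()` function and imports from the script above (`main_imap` is just an illustrative name):

def main_imap():
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    sites = ["google.com", "bing.com", "duckduckgo.com", "stackoverflow.com"]*3
    pool = Pool(2)  # at most 2 concurrent HEAD requests
    # iterating blocks until every site has been processed; process() logs each
    # response itself, so the results (None) are simply discarded
    for _ in pool.imap_unordered(process, sites):
        pass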