
Basically, I just want to know whether I've implemented threading correctly for concurrent socket I/O. Here's my approach:

#!/usr/bin/env python
import sys
import time
from gevent import socket, Timeout, select
from gevent.pool import Pool

def worker(website):
    data = str()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(0)
    s.connect(('google.com',80))
    s.send('HEAD / HTTP/1.1\n\n')

    while True:
        read, w, e = select.select([s], [], [])
        if read:
            data = s.recv(1024)
            if data:
                break

    print ('done')
    return 0

def main():

    pool = Pool(10)
    for item in items:
        pool.spawn(worker, item)
    pool.join()
Jason
  • Instead of checking `if read:`, I think it might be better (and more readable) to explicitly check for the socket in the read list: `if s in read:`. Also, for something simple like this, why not keep the socket blocking, and just call `recv` on it? – Some programmer dude Jun 12 '12 at 05:26
  • Am I wrong, or does it seem excessive to use a select call on every single greenlet socket? Wouldn't you either create all your sockets and select on the entire set...or if using the pool approach...use the `gevent.socket.wait_read(fileno, ...)` on each individual socket? Or like @JoachimPileborg suggested, just set the socket to block, and directly call recv on each socket? – jdi Jun 12 '12 at 05:33
  • If you are only reading one socket, I don't see any reason to use `select`. Heck, even if you're reading from multiple sockets, there's *still* no reason to use select — just spawn multiple green threads to do the reading, then have them write to a queue, or use `Group.map` to stick them in a list, or whatever else makes sense. – David Wolever Jun 12 '12 at 06:31

1 Answer


The threading portion (pool.spawn) is fine, although Group.map (or imap, or imap_unordered) might be even prettier.
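To illustrate the difference between those helpers, here is a minimal sketch. The `slow_echo` worker and the two `run_*` functions are made-up names for the example, and `gevent.sleep` stands in for real socket work:

```python
import gevent
from gevent.pool import Pool


def slow_echo(delay):
    """Hypothetical worker: pretend to do I/O for `delay` seconds."""
    gevent.sleep(delay)  # yields to other greenlets, like a socket read would
    return delay


def run_ordered(delays, size=10):
    # map() blocks until every task is done and keeps the input order.
    return Pool(size).map(slow_echo, delays)


def run_as_completed(delays, size=10):
    # imap_unordered() yields each result as soon as its greenlet finishes,
    # so the order depends on which task completes first.
    return list(Pool(size).imap_unordered(slow_echo, delays))
```

Use `map` when the results need to line up with the inputs, and `imap_unordered` when you would rather start processing whichever result arrives first.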

The select is entirely unnecessary, though. Since you're using gevent's patched socket, you can just use:

data = s.recv(1024)

A few other things:

  • You won't need the call to setblocking if you're doing it that way.
  • To be entirely correct, you should use socket.sendall, since send may write only part of the buffer.
  • With gevent, you will almost never select. If you need to read from ten sockets, just spawn ten green threads. For example: results = Group().map(lambda s: s.recv(1024), my_sockets).
  • While we're here: it's very strange to use data = str(); data = "" would be much more standard.
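Putting those points together, the worker could look roughly like the sketch below. This is one possible shape, not the only one: `fetch_head`, `fetch_all`, the `(host, port)` tuple argument, the 5-second timeout and the extra `Host`/`Connection` headers are my own assumptions rather than anything from the question.

```python
from gevent import socket
from gevent.pool import Pool


def fetch_head(address, timeout=5.0):
    """Send a HEAD request to (host, port) and return the first chunk of the reply."""
    host, port = address
    # gevent's socket yields to other greenlets while it waits, so there is
    # no need for setblocking(0) or select() here.
    s = socket.create_connection((host, port), timeout=timeout)
    try:
        request = "HEAD / HTTP/1.1\r\nHost: {0}\r\nConnection: close\r\n\r\n".format(host)
        s.sendall(request.encode("ascii"))  # sendall keeps going until everything is written
        return s.recv(1024)
    finally:
        s.close()


def fetch_all(addresses, pool_size=10):
    """Run fetch_head for each address, at most pool_size at a time."""
    pool = Pool(pool_size)
    jobs = [pool.spawn(fetch_head, address) for address in addresses]
    pool.join()
    # A greenlet's .value is its return value, or None if it raised.
    return [job.value for job in jobs]
```

Calling something like `fetch_all([('google.com', 80)])` would then return a list with one response chunk per address.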
David Wolever