
I expected the code below to run 10 greenlets concurrently and to start the remaining ones as slots in the pool freed up. Instead, the extra calls crash, as if workrow tried to carry on even though the urllib request had not succeeded. But why? I thought the function would not be called at all until the pool had a free slot for its greenlet. And even if the function was called early, what stopped it from finishing?

FWIW: The workrow function (not shown) takes a list (one row of the CSV), makes a web API request, parses the JSON response, and writes a line to another CSV.

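import gevent
from gevent import socket
from gevent import monkey, pool
# patches stdlib (including socket and ssl modules) to cooperate with other greenlets
monkey.patch_all()

import unicodecsv  # third-party CSV reader, imported after patching

# pool limited to 10 concurrent greenlets
p = pool.Pool(10)

# inputfile (path to the input CSV) and workrow are defined elsewhere
with open(inputfile, 'rb') as csvfile:
    entreader = unicodecsv.reader(csvfile, delimiter=',', quotechar='"')
    # take only the first 20 rows for this test
    head = [entreader.next() for x in xrange(20)]
    for row in head:
        p.spawn(workrow, row)
# wait for all spawned greenlets to finish
p.join()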
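One way to check whether the pool really caps concurrency is to swap workrow for a stand-in that only counts how many copies are running at once. The sketch below is not the code from the question: fake_workrow, running, peak and done are hypothetical names, and gevent.sleep stands in for the network wait. It is written so that it also runs on Python 3.

```python
import gevent
from gevent import pool

running = 0  # greenlets currently inside fake_workrow
peak = 0     # highest value `running` ever reached
done = 0     # how many calls completed

def fake_workrow(row):
    """Hypothetical stand-in for workrow: counts concurrency, then yields."""
    global running, peak, done
    running += 1
    peak = max(peak, running)
    gevent.sleep(0.01)  # cooperative wait, like a pending HTTP request
    running -= 1
    done += 1

p = pool.Pool(10)
for row in range(20):
    # spawn() waits here whenever all 10 slots are taken
    p.spawn(fake_workrow, row)
p.join()

print("peak concurrency:", peak, "completed:", done)
```

If the peak never goes above the pool size and all 20 calls complete, the pool is doing its job, and the failures are more likely to come from what the requests return than from the scheduling.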

The error:

Traceback (most recent call last):
  File "/Users/laszlosandor/anaconda/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/Users/laszlosandor/Downloads/GoogleGeocoding/google_geocoding_LS.py", line 37, in workrow
    result = data_json["results"][0]
IndexError: list index out of range
<Greenlet at 0x101245c30: workrow([u'p_366937', u'/entity/p/366937.xml', u'H\xe9derv)> failed with IndexError
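The traceback shows that the request itself came back with JSON, just with an empty "results" list. Assuming the response comes from the Google Geocoding web service (suggested by the file path, not confirmed in the question), the top-level "status" field says why the list is empty. A minimal sketch of a guard, where first_result and the two sample dicts are hypothetical and hand-written for illustration:

```python
def first_result(data_json):
    """Return (first result or None, status) instead of raising IndexError."""
    status = data_json.get("status", "UNKNOWN")
    results = data_json.get("results") or []
    if status != "OK" or not results:
        return None, status
    return results[0], status

# Hand-written sample responses, not real API output
ok = {"status": "OK", "results": [{"formatted_address": "example address"}]}
limited = {"status": "OVER_QUERY_LIMIT", "results": []}

for sample in (ok, limited):
    result, status = first_result(sample)
    if result is None:
        print("no result, status was", status)
    else:
        print("got", result["formatted_address"])
```

Logging the status for the failing rows would show whether the service is rejecting the extra queries (for example with OVER_QUERY_LIMIT) or simply finding no match (ZERO_RESULTS).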
László
  • I'm not sure what the expected result is. The greenlets are exiting early because data_json["results"] is an empty sequence, so indexing it raises an IndexError – Ben Wilber Mar 29 '14 at 18:25
  • @BenWilber The expected result is that only 10 instances of workrow run at a time, each with a different row, while the other rows wait for their turn, even though they were already spawned. Isn't this what gevent Pool offers? – László Mar 29 '14 at 18:27
  • @BenWilber Oh, but I strongly suspect that the empty sequence only arises when I spawn more calls than the pool size. Basically the queries fail in that case, but not if I stay within the size of the Pool. But that defeats the purpose of the Pool, no?! – László Mar 29 '14 at 18:28
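If the queries only fail once more of them are sent than the pool size, one possible explanation (an assumption, not something the question confirms) is that the web service throttles bursts of requests rather than gevent misbehaving. A hedged sketch of retrying with a growing delay in that case, where fetch_with_retry, fetch and the fake responses are all hypothetical:

```python
import time

def fetch_with_retry(fetch, row, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(row) again while the status is OVER_QUERY_LIMIT.

    fetch is any callable that returns the parsed JSON dict for a row.
    """
    data_json = fetch(row)
    for attempt in range(retries):
        if data_json.get("status") != "OVER_QUERY_LIMIT":
            break
        sleep(delay * (2 ** attempt))  # wait delay, 2*delay, 4*delay, ...
        data_json = fetch(row)
    return data_json

# Demo with a fake fetch that is throttled twice before succeeding
responses = [
    {"status": "OVER_QUERY_LIMIT", "results": []},
    {"status": "OVER_QUERY_LIMIT", "results": []},
    {"status": "OK", "results": [{"formatted_address": "example address"}]},
]

def fake_fetch(row):
    return responses.pop(0)

# sleep is replaced by a no-op so the demo finishes immediately
final = fetch_with_retry(fake_fetch, ["p_1"], sleep=lambda seconds: None)
print(final["status"])
```

With monkey.patch_all() applied, time.sleep yields to other greenlets, so a waiting row would not hold up the rest of the pool.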
