I have a function like this

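from gevent.pool import Pool

def check_urls(res):
    pool = Pool(10)  # at most 10 greenlets run concurrently
    print(pool.free_count())
    for row in res:
        pool.spawn(fetch, row[0], row[1])  # fetch is defined elsewhere
    pool.join()  # blocks until every spawned greenlet has finished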

pool.free_count() prints 10.

I used pdb to trace it. The program works fine up to and including the pool.spawn() loop.

But it waits forever at the pool.join() line.

Can someone tell me what's wrong?

PrivateUser

1 Answer

But it's waiting forever at the pool.join() line.
Can someone tell me what's wrong?

Nothing!

Although I first wrote what's below the line, join() in gevent behaves much the same way as in multiprocessing/threading: it blocks until all the greenlets are done.

If you only want to test whether all the greenlets in the pool have finished, without blocking, check ready() on each greenlet in the pool:

is_over = all(gl.ready() for gl in pool.greenlets)

Basically, join() is not waiting forever; it is waiting until all your greenlets have finished. If one of them never ends, join() will block forever. So make sure every greenlet terminates, and join() will return once all the jobs are done.
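If you cannot guarantee that every job ends, one option is to bound the wait. Below is a minimal sketch, assuming gevent is installed. The fetch() here is a hypothetical stand-in that only sleeps, and run_with_deadline is an illustrative name, not part of gevent. It passes a timeout to join(), uses ready() to see what is still running, and then kills the leftovers:

```python
import gevent
from gevent.pool import Pool


def fetch(url, delay):
    # Hypothetical stand-in for the real fetch(): sleeps cooperatively.
    gevent.sleep(delay)
    return url


def run_with_deadline(rows, deadline):
    pool = Pool(10)
    jobs = [pool.spawn(fetch, url, delay) for url, delay in rows]

    # Give up waiting after `deadline` seconds, even if some jobs still run.
    pool.join(timeout=deadline)

    # Collect results before killing: a killed greenlet also counts as "ready".
    results = [g.value for g in jobs if g.successful()]
    unfinished = [g for g in jobs if not g.ready()]

    pool.kill()  # stop whatever is still running
    return results, len(unfinished)


# The second job would take 60 seconds, so it is still running at the deadline.
done, stuck = run_with_deadline([("a", 0.01), ("b", 60)], deadline=0.5)
```

Whether killing unfinished greenlets is acceptable depends on what your real fetch() does; a per-request timeout inside fetch() may be a better fit.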


edit: The following applies only to the standard library's multiprocessing and threading APIs. gevent's greenlet pools do not match that "standard" API.

The purpose of the join() method on a Thread/Process is to make the main process/thread wait until the child processes/threads have finished.

You can use the timeout parameter to make it return after some time, or you can use the is_alive() method to check, without blocking, whether the thread or process is still running.
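As a small illustration with the standard threading module (the worker function and the Event used to stop it are hypothetical, chosen only to keep the thread alive long enough to observe):

```python
import threading


def worker(stop_event):
    # Hypothetical long-running job: it exits once stop_event is set.
    stop_event.wait()


stop = threading.Event()
t = threading.Thread(target=worker, args=(stop,))
t.start()

t.join(timeout=0.1)            # returns after at most about 0.1 s
still_running = t.is_alive()   # True here: join() timed out, the thread lives on

stop.set()                     # let the worker finish
t.join()                       # now returns promptly
finished = not t.is_alive()
```

Note that join(timeout=...) does not tell you whether it timed out; you have to call is_alive() afterwards to find out.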

In the context of a process/thread pool, join() may only be called after a call to either close() or terminate(), so you may want to:

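for row in res:
    # standard-library pools submit work with apply_async(); spawn() is gevent-only
    pool.apply_async(fetch, (row[0], row[1]))
pool.close()  # no more tasks will be submitted
pool.join()   # wait for the workers to finish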
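For completeness, here is a self-contained sketch of that close()/join() sequence. It assumes multiprocessing.pool.ThreadPool, which exposes the same interface as multiprocessing.Pool, and a hypothetical fetch() that only formats its arguments:

```python
from multiprocessing.pool import ThreadPool


def fetch(url, retries):
    # Hypothetical stand-in for the real fetch().
    return "%s:%d" % (url, retries)


rows = [("http://example.com/a", 1), ("http://example.com/b", 2)]

pool = ThreadPool(4)
pending = [pool.apply_async(fetch, (row[0], row[1])) for row in rows]

pool.close()   # no more tasks may be submitted
pool.join()    # blocks until all submitted tasks have finished

results = [p.get() for p in pending]
```

None of this carries over to gevent: its Pool has no close(), and join() can be called directly.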
zmo
  • I'm not seeing `is_alive` method here. https://github.com/surfly/gevent/blob/master/gevent/pool.py – PrivateUser May 19 '14 at 14:01
  • oh, I missed your `gevent` tag /o\ my bad, I'm talking about standard process or thread pools. – zmo May 19 '14 at 14:02
  • anyway, the `is_alive` method equivalent would be on each greenlet process, not on the pool itself. – zmo May 19 '14 at 14:04
  • Also I'm getting this error `AttributeError: 'Pool' object has no attribute 'close'` – PrivateUser May 19 '14 at 14:04
  • I still don't get your answer. Do I have to use `is_over` instead of `pool.join()`? – PrivateUser May 19 '14 at 14:19
  • It all depends on the context of your code, which I do not have. If you call `pool.join()` it **WILL** block by design until all your greenlets have finished and execution can return to your main loop. If you do not want it to block, but only want to check regularly whether your greenlets have finished, you can use the `is_over` snippet I gave you. – zmo May 19 '14 at 14:21