19

I want to call join() on a Queue, but give up with a timeout if the call hasn't returned after some time. What is the best way to do this? Can it be done by subclassing Queue or by using a metaclass?

philipxy
olamundo

4 Answers

24

Subclassing Queue is probably the best way. Something like this should work (untested):

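from queue import Queue
from time import monotonic

class NotFinished(Exception):
    """Raised when tasks are still unfinished once the timeout expires."""

class TimeoutQueue(Queue):
    def join_with_timeout(self, timeout):
        # all_tasks_done is an undocumented Condition inside queue.Queue
        with self.all_tasks_done:
            endtime = monotonic() + timeout
            while self.unfinished_tasks:
                remaining = endtime - monotonic()
                if remaining <= 0.0:
                    raise NotFinished
                self.all_tasks_done.wait(remaining)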
Lukáš Lalinský
  • 1
    Thanks! Where did you get the info about all_tasks_done? I looked in http://docs.python.org/library/queue.html#module-Queue but I don't see any mention of that member... – olamundo Oct 14 '09 at 07:42
  • 4
    You can read the source code for Queue. It has a `timeout` parameter implemented for `put` and `get`, so it was easy enough to extend `join` with a similar approach. – Lukáš Lalinský Oct 14 '09 at 07:45
  • 1
    Any idea why `all_tasks_done` is not documented? This may mean that this method could be changed/broken in any release. – Chris W. Jan 02 '13 at 19:25
  • How is this implemented? Do you call q.join_with_timeout instead of q.join()? – Source Matters Jun 10 '18 at 17:25
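To make the question in the last comment concrete, here is a minimal, self-contained sketch of how such a subclass might be used: you create the subclass instead of a plain Queue and call q.join_with_timeout(...) where you would otherwise call q.join(). The names TimeoutQueue, JoinTimeout, worker and demo are made up for illustration, and the code relies on the undocumented all_tasks_done and unfinished_tasks attributes of queue.Queue, so treat it as an assumption-laden sketch rather than a supported API.

```python
import threading
import time
from queue import Queue


class JoinTimeout(Exception):
    """Hypothetical exception raised when join_with_timeout() gives up."""


class TimeoutQueue(Queue):
    """queue.Queue plus a join that gives up after `timeout` seconds."""

    def join_with_timeout(self, timeout):
        # all_tasks_done is an undocumented Condition inside queue.Queue.
        with self.all_tasks_done:
            deadline = time.monotonic() + timeout
            while self.unfinished_tasks:
                remaining = deadline - time.monotonic()
                if remaining <= 0.0:
                    raise JoinTimeout("tasks still unfinished")
                self.all_tasks_done.wait(remaining)


def worker(q):
    # Takes items forever and marks each one as done.
    while True:
        q.get()
        q.task_done()


def demo():
    # Case 1: a worker drains the queue, so the join returns normally.
    q = TimeoutQueue()
    threading.Thread(target=worker, args=(q,), daemon=True).start()
    for n in range(3):
        q.put(n)
    q.join_with_timeout(2.0)

    # Case 2: nobody consumes the item, so the join times out.
    stuck = TimeoutQueue()
    stuck.put("never processed")
    try:
        stuck.join_with_timeout(0.1)
    except JoinTimeout:
        return True
    return False
```

Calling demo() exercises both paths: the first join returns once the worker has called task_done() for every item, and the second raises JoinTimeout because task_done() is never called.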
18

The join() method is all about waiting for all the tasks to be done. If you don't care whether the tasks have actually finished, you can periodically poll the unfinished task count:

from time import time, sleep

# q is the Queue being watched; timeout is the limit in seconds
stop = time() + timeout
while q.unfinished_tasks and time() < stop:
    sleep(1)

This loop will exit either when the tasks are done or when the timeout period has elapsed.
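The same polling idea can be wrapped in a small helper that also reports which of the two exit conditions occurred. This is only a sketch: the function name wait_until_done and the poll_interval parameter are assumptions, and unfinished_tasks is an attribute of queue.Queue that the documentation does not describe.

```python
import time
from queue import Queue


def wait_until_done(q, timeout, poll_interval=0.05):
    """Poll q.unfinished_tasks; return True if it reached zero in time."""
    deadline = time.monotonic() + timeout
    while q.unfinished_tasks:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False          # timed out with tasks still pending
        # Never sleep past the deadline.
        time.sleep(min(poll_interval, remaining))
    return True                   # every put() was matched by a task_done()
```

A shorter polling interval makes the helper react sooner once the last task finishes, at the cost of waking up more often.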

Raymond

Raymond Hettinger
0

First, make sure that every worker thread calls task_done() for each item it takes from the queue.

To add a timeout, you can run the Queue code inside a Thread and put the timeout on that Thread with Thread.join([timeout]).

Untested example outlining what I suggest:

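from queue import Queue
from threading import Thread

q = Queue()  # module level, so worker() and queuefunc() share the same queue

def worker():
    while True:
        item = q.get()
        do_work(item)      # do_work() is supplied by the caller
        q.task_done()

def queuefunc():
    # num_worker_threads and source() are supplied by the caller
    for i in range(num_worker_threads):
        t = Thread(target=worker, daemon=True)
        t.start()

    for item in source():
        q.put(item)

    q.join()       # block until all tasks are done

t = Thread(target=queuefunc, daemon=True)
t.start()
t.join(100)  # timeout applies here
if t.is_alive():
    print("timed out")  # join() returns None, so check is_alive()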
tuergeist
  • `t.join(100)` will be a timeout for the whole job. That wouldn't work for my use case, where I fill the queue over several hours and only call `q.join()` after I am done loading sources. Then I should have a much shorter timeout to catch the cases where, for whatever reason (including bugs), the workers fail to call `q.get()` enough times or `q.task_done()` equally many times. – Elias Hasle May 20 '21 at 09:33
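For the use case in the comment above, where only the final q.join() should be bounded, the same Thread.join(timeout) technique can be applied to the join call alone. The sketch below uses only documented APIs; the helper name join_queue is made up for illustration.

```python
import threading
from queue import Queue


def join_queue(q, timeout):
    """Run q.join() in a daemon thread; True if it returned within timeout."""
    joiner = threading.Thread(target=q.join, daemon=True)
    joiner.start()
    joiner.join(timeout)           # wait for the helper thread, not the queue
    return not joiner.is_alive()   # still alive means q.join() is still blocked
```

One trade-off to be aware of: when the timeout expires, the helper thread stays blocked inside q.join() until the queue is eventually drained or the process exits, which is why it is created as a daemon thread.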
0

When I tried to implement the accepted answer, all_tasks_done was not defined. That attribute belongs to queue.Queue; multiprocessing's JoinableQueue does not have it. A quick workaround there is to use the timeout of the wait() call that JoinableQueue.join makes.

For example, overriding join in a subclass of JoinableQueue as follows puts a 15 s timeout on the wait:

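from multiprocessing.queues import JoinableQueue

class TimeoutJoinableQueue(JoinableQueue):
    # Instantiate with TimeoutJoinableQueue(ctx=multiprocessing.get_context()).
    def join(self):
        with self._cond:  # private attributes; may change between releases
            if not self._unfinished_tasks._semlock._is_zero():
                self._cond.wait(15)  # returns after at most 15 s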
jpeg
peppie
  • How is this implemented? – free_123 Jul 20 '22 at 12:28
  • @free_123 Which part? The code in the answer should be put in a new class inheriting from JoinableQueue, to define a new type of queue that you then use in your code. – peppie Jul 22 '22 at 07:57
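As a rough illustration of what that last comment describes, the sketch below defines such a class and makes the timeout a parameter instead of a fixed 15 s. The class name TimeoutJoinableQueue and the boolean return value are assumptions added here. The code depends on the private attributes _cond and _unfinished_tasks used in the answer, which may change between Python releases, and it subclasses multiprocessing.queues.JoinableQueue directly, which requires passing a ctx argument.

```python
import multiprocessing
from multiprocessing.queues import JoinableQueue


class TimeoutJoinableQueue(JoinableQueue):
    """JoinableQueue whose join() accepts an optional timeout in seconds."""

    def join(self, timeout=None):
        # _cond and _unfinished_tasks are private implementation details.
        with self._cond:
            if not self._unfinished_tasks._semlock._is_zero():
                self._cond.wait(timeout)
            # True only if every put() has been matched by a task_done().
            return self._unfinished_tasks._semlock._is_zero()


def make_queue():
    # The class must be given a multiprocessing context explicitly.
    return TimeoutJoinableQueue(ctx=multiprocessing.get_context())
```

In code that previously created a JoinableQueue, you would create the queue with make_queue() and call q.join(15) instead of q.join(), then check the returned flag to see whether the wait timed out.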