
Suppose we have a deque with maxlen=3. If the deque already has 3 items and I append a new item, how can I get the item that is about to be discarded?

The reason is that I want to maintain a window in memory that contains only the last N items. When the window is full and an item is about to be discarded, I need to get that item and do some extra work with it.

Here's my current solution:

from collections import deque

MAX_LEN=10

q = deque(maxlen=MAX_LEN)

while True:
    if len(q) == MAX_LEN:
        discarded = q.popleft()
        process(discarded)
    q.append(some_var)

Is this the best I can get? I've thought of using a list and slicing it to limit the size and get the discarded item, but the if is unavoidable. With a deque, at least I get O(1) performance for the append and popleft operations.

Any ideas?

yegle
  • Why not `process(q[0]); q.append(some_var)`? – Bakuriu Feb 12 '14 at 19:09
  • @Bakuriu I need to process the one that's about to be discarded. The queue starts with 0 items, and once it reaches 3 items, I need to start processing `q[0]`. I just want to avoid the `if` comparison in the loop. – yegle Feb 12 '14 at 19:11
  • There's nothing wrong with your code, nor do I think you can do it any better. Although you *can* get rid of the `discarded` variable. And you *could* create a new subclass to do this for you – loopbackbee Feb 12 '14 at 19:13
  • What are you hoping to achieve here? There's no way the length test has a significant impact on performance, so what do you have to gain by trying to optimize it away? Besides, one way or another, you *must* check if you're at the maximum length, so the most I think you can do is `process(q.popleft())` without the interim variable binding to `discarded`. – Henry Keiter Feb 12 '14 at 19:14
  • The `deque` is the right data structure (in terms of efficiency), but it doesn't provide a "built-in" way of achieving what you want (which basically is a callback to be used in conjunction with `maxlen`...). The only way to avoid the explicit `if` is to hide it into some code that could be a subclass or (IMHO better) a simple wrapper of the `deque`. – Bakuriu Feb 12 '14 at 19:16
  • I think one would like to leave the discarding to the deque, since that's its job, but still have a way of retrieving the discarded elements. This code, while functional, feels more like a hack on deque than a real solution. – njzk2 Feb 12 '14 at 19:17
  • @njzk2 You are right. In fact the above code would work as well without using `maxlen` at all. – Bakuriu Feb 12 '14 at 19:18
  • @Bakuriu: exactly. All that testing and popping of items is really what a deque does. It feels frustrating to rewrite it. – njzk2 Feb 12 '14 at 19:19
  • @AbhishekBansal Can you show an example of a simpler queue? AFAIK `deque` *is* the simplest and fastest queue the stdlib provides. Note that the `queue` module provides *synchronized* queues, which are only useful when using multi-threading. – Bakuriu Feb 12 '14 at 19:20
  • @Bakuriu yes, actually you are right. – Abhishek Bansal Feb 12 '14 at 19:23
  • Is your maxlen really only 3? If it is... just make your own. It'll probably be faster too. Or was that just "for example"? – Travis Griggs Feb 12 '14 at 19:25
  • If you can fit the generation of new values into a generator, you could create a deque- or tee-based generator holding a 3-wide window of the iteration. Then you could read the first element of each yielded window instead of popping and appending on each iteration. – M4rtini Feb 12 '14 at 19:30
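
The wrapper idea raised in the comments above could look roughly like the sketch below. The class name `EvictingWindow` and the `on_evict` callback are hypothetical names chosen for illustration, not part of `collections`. The sketch assumes `maxlen` is at least 1, and the length check has not gone away: it is only hidden inside `append`.

```python
from collections import deque


class EvictingWindow:
    """Hypothetical wrapper: a bounded deque that reports evicted items."""

    def __init__(self, maxlen, on_evict):
        # Assumes maxlen >= 1; with maxlen == 0 there is never an item to report.
        self._items = deque(maxlen=maxlen)
        self._on_evict = on_evict  # called with each item about to be pushed out

    def append(self, item):
        # The length check still happens; it is just hidden in here.
        if len(self._items) == self._items.maxlen:
            self._on_evict(self._items[0])
        self._items.append(item)  # the deque drops the oldest item itself

    def __iter__(self):
        return iter(self._items)


evicted = []
window = EvictingWindow(maxlen=3, on_evict=evicted.append)
for value in range(6):
    window.append(value)

print(evicted)       # [0, 1, 2]
print(list(window))  # [3, 4, 5]
```

Composition is used here rather than subclassing `deque`, so that other mutating methods such as `extend` or `appendleft` cannot bypass the callback unnoticed.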

2 Answers


Something like this might work for you.

from itertools import islice, tee

def pairwise(iterable, n=2):
    # Offset the k-th of n tee'd iterators by k items, then zip them into windows.
    # In Python 3, zip is lazy, so this also works on infinite iterables.
    return zip(*(islice(it, pos, None) for pos, it in enumerate(tee(iterable, n))))


for values in pairwise(infinite_iterable_generating_your_values, n=3):
    process(values[0])
    if breakconditions:
        break

Example of the pairwise function:

print(list(pairwise(range(10), n=3)))

[(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9)]
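
As a sanity check on how these windows line up with the deque's evictions, the sketch below compares the first element of each window with the items discarded by the loop from the question. The helper names `windows` and `evicted_by_deque` are made up for this comparison. Under these assumptions, windows of size n also yield the first element of the final window, which the deque has not discarded yet, while windows of size n + 1 reproduce the discarded items exactly.

```python
from collections import deque
from itertools import islice, tee


def windows(iterable, n):
    # Same idea as pairwise above: n tee'd iterators, the k-th offset by k items.
    return zip(*(islice(it, pos, None) for pos, it in enumerate(tee(iterable, n))))


def evicted_by_deque(iterable, maxlen):
    # The loop from the question, collecting the discarded items in a list.
    q = deque(maxlen=maxlen)
    out = []
    for item in iterable:
        if len(q) == maxlen:
            out.append(q.popleft())
        q.append(item)
    return out


data = range(10)
print(evicted_by_deque(data, 3))         # [0, 1, 2, 3, 4, 5, 6]
print([w[0] for w in windows(data, 3)])  # [0, 1, 2, 3, 4, 5, 6, 7]
print([w[0] for w in windows(data, 4)])  # [0, 1, 2, 3, 4, 5, 6]
```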
M4rtini

The intended way to get the item about to be discarded is with an indexed lookup, d[0]:

>>> from collections import deque

>>> d = deque('abc', maxlen=3)
>>> about_to_be_discarded = d[0]
>>> about_to_be_discarded
'a'
>>> d.append('d')
>>> d
deque(['b', 'c', 'd'], maxlen=3)

The secondary problem is avoiding an if to determine whether the deque is full. One approach is to have two separate loops, one that initially fills the window and another that handles discards as new values are loaded:

d = deque(maxlen=n)
for i in range(n):
    d.append(get_input_value())
while True:
    discard = d[0]
    d.append(get_input_value())
    process(discard)
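
For a version that can be run as-is, here is a minimal sketch of the same two-phase idea, assuming the input is a finite iterable rather than a `get_input_value()` call. The helper name `process_discards` and its parameters are hypothetical. `islice` plays the role of the first loop that fills the window, and the sketch assumes n is at least 1.

```python
from collections import deque
from itertools import islice


def process_discards(values, n, process):
    """Hypothetical helper: call process() on each item that leaves the window."""
    source = iter(values)
    d = deque(islice(source, n), maxlen=n)  # first phase: fill the window
    for value in source:                    # second phase: the window is full
        discard = d[0]
        d.append(value)
        process(discard)
    return d


seen = []
final = process_discards(range(7), 3, seen.append)
print(seen)         # [0, 1, 2, 3]
print(list(final))  # [4, 5, 6]
```

If the input has fewer than n items, the second phase never runs and nothing is processed, which matches the behavior of the loop in the question.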
Raymond Hettinger