
I need to build a circular buffer in Python that behaves like a deque with a maximum length but supports efficient membership tests: `el in deque` is O(n), and I need O(1), as with `set()`.

from collections import deque
dq = deque(maxlen=10)  # in my case maxlen=1000; named dq so the deque class is not shadowed
for i in range(20):
    dq.append(i)
dq
Out[1]: deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], maxlen=10)
10 in dq  # but this takes O(n), I need O(1)
Out[2]: True

I guess I need to maintain a separate dictionary for lookups and remove items from it once the deque is full, but I don't understand how. I don't need to remove from the middle of the deque; I only need to append as a deque does, plus quick lookup.

dandan
  • Please update your question with an example of a lookup, even if it is an inefficient one. – quamrana May 06 '19 at 11:58
  • Just thinking aloud: There are two problems with a separate `dict`: 1. How to keep the `dict` matching the contents of the `deque` if you were to append lots of stuff. 2. How to store stuff in a `dict` or `set` if you have duplicates in the deque. – quamrana May 06 '19 at 12:03
  • 1. I don't know. 2. I will not add an incoming item to the `deque` if it is already in the `dict`; that's how I will keep the `deque`'s elements unique. – dandan May 06 '19 at 12:07
  • Well, essentially you want the constructor of a `deque` (i.e. with `maxlen`), plus `append()`, plus `in` (i.e. `__contains__()`), so that seems straightforward. Perhaps you could update your question with a class of your own that has a `deque` as a member, plus the methods I have outlined. – quamrana May 06 '19 at 12:11

1 Answer


As you said, you need a data structure that combines a deque for insertion/removal with a set for O(1) lookup, like this:

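from collections import deque

class CircularBuffer:
    def __init__(self, capacity):
        self.queue = deque()      # keeps insertion order, oldest on the left
        self.capacity = capacity
        self.value_set = set()    # mirrors the queue for O(1) average lookup

    def add(self, value):
        # Skip values that are already stored, so the buffer stays unique.
        if self.contains(value):
            return
        # When full, evict the oldest value from both the queue and the set.
        if len(self.queue) >= self.capacity:
            self.value_set.remove(self.queue.popleft())
        self.queue.append(value)
        self.value_set.add(value)

    def contains(self, value):
        return value in self.value_set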

Test and output:

cb = CircularBuffer(10)

for i in range(20):
    cb.add(i)

print(cb.queue)
print(cb.contains(10))

# deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
# True
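The question uses `in` directly, and the comments suggest a class exposing `append()` and `__contains__()`. A minimal sketch of that interface follows, using the same deque + set idea. The class name `UniqueCircularBuffer` and its private attribute names are hypothetical, chosen only for illustration.

```python
from collections import deque


class UniqueCircularBuffer:
    """Hypothetical variant of CircularBuffer with container-protocol methods."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._queue = deque()   # insertion order, oldest on the left
        self._members = set()   # same values, for O(1) average membership tests

    def append(self, value):
        if value in self._members:
            return  # duplicates are ignored, as in add() above
        if len(self._queue) >= self.capacity:
            # Evict the oldest value from both structures.
            self._members.discard(self._queue.popleft())
        self._queue.append(value)
        self._members.add(value)

    def __contains__(self, value):
        return value in self._members

    def __len__(self):
        return len(self._queue)

    def __iter__(self):
        return iter(self._queue)


buf = UniqueCircularBuffer(10)
for i in range(20):
    buf.append(i)

print(list(buf))            # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
print(10 in buf, 9 in buf)  # True False
```

As with any set-based lookup, the stored values must be hashable.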

The same idea is used to implement a simple LRU cache: a dict plus a doubly linked list.
Hope that helps you, and comment if you have further questions. : )
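To illustrate the LRU comparison, here is a rough sketch built on `collections.OrderedDict`, which provides the dict-plus-ordering behaviour in one structure. The class name `RecentSet` is hypothetical. Note one deliberate difference from `CircularBuffer`: re-adding an existing value moves it to the newest position, which is the LRU behaviour and may not be what the question wants.

```python
from collections import OrderedDict


class RecentSet:
    """Hypothetical bounded set with LRU-style eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # keys are the stored values; dict values are unused

    def add(self, value):
        if value in self._items:
            # Unlike CircularBuffer, re-adding refreshes the value's position.
            self._items.move_to_end(value)
            return
        if len(self._items) >= self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry
        self._items[value] = None

    def __contains__(self, value):
        return value in self._items


rs = RecentSet(3)
for v in [1, 2, 3, 1, 4]:
    rs.add(v)

print(list(rs._items))  # [3, 1, 4]
```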

recnac
  • Ok, I was hoping that the OP would give this a go first. But good answer. You have provided straightforward code, plus example usage and output! +1 – quamrana May 06 '19 at 12:13
  • it is an industrial process in OS, right? I can only reach that, so far. : ) @quamrana – recnac May 06 '19 at 12:18
  • Just to note: You haven't done anything about the requirement for no duplicates in the `deque`. – quamrana May 06 '19 at 12:22
  • Fixed it. Thanks for mentioning it; I hadn't read the comments before. @quamrana – recnac May 06 '19 at 12:24
  • @recnac thanks, do you think we should init the deque like this? `self.queue = deque(maxlen=capacity)` – dandan May 06 '19 at 12:27
  • I have considered that, but you need to get hold of the element that is popped so you can remove it from the set, so `maxlen` is not necessary here. @dandan – recnac May 06 '19 at 12:29
  • @recnac In case I need to add multiple values at once, is it OK to do the following: `for value in values: cb.add(value)`? Or would it be better to create a separate method that uses `self.queue.extend(values)` instead of a for loop? – dandan May 06 '19 at 12:43
  • Yes, actually I implemented an extend method in my first version. @dandan – recnac May 06 '19 at 12:45
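The `extend` method mentioned in the last comment is not shown anywhere in the thread, so the following is only a guess at what such a method could look like, not recnac's version. It simply loops over `add()`. Calling `self.queue.extend(values)` directly would bypass both the duplicate check and the eviction from `value_set`, so the loop seems the safer choice.

```python
from collections import deque


class CircularBuffer:
    def __init__(self, capacity):
        self.queue = deque()
        self.capacity = capacity
        self.value_set = set()

    def add(self, value):
        if self.contains(value):
            return
        if len(self.queue) >= self.capacity:
            self.value_set.remove(self.queue.popleft())
        self.queue.append(value)
        self.value_set.add(value)

    def extend(self, values):
        # Hypothetical helper: reuse add() so duplicates are skipped and
        # evicted values are removed from the set as well as the queue.
        for value in values:
            self.add(value)

    def contains(self, value):
        return value in self.value_set


cb = CircularBuffer(3)
cb.extend([1, 2, 2, 3, 4])

print(cb.queue)        # deque([2, 3, 4])
print(cb.contains(1))  # False
```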