7

The Python class has six requirements as listed below. Only bold terms are to be read as requirements.


  1. Close to O(1) performance for as many of the following four operations as possible.
  2. Maintaining sorted order while inserting an object into the container.
  3. Ability to peek at last value (the largest value) contained in the object.
  4. Allowing for pops on both sides (getting the smallest or largest values).
  5. Capability of getting the total size or number of objects being stored.
  6. Being a ready made solution like the code in Python's standard library.

What follows is left here for historical reasons (to help the curious and to show that research was conducted).


After looking through Python's Standard Library (specifically the section on Data Types), I still have not found a class that fulfills the requirements of a fragmentation table. collections.deque is close to what is required, but it does not support keeping the data contained in it sorted. It provides:

  1. Efficient append and pops on either side of a deque with O(1) performance.
  2. Pops on both sides for the data contained within the object.
  3. Getting the total size or count of objects contained within.
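As a minimal sketch of what the list above describes (the values are made up for illustration), a deque handles both ends and reports its size, but it never reorders anything on its own:

```python
from collections import deque

d = deque([2, 3, 4])
d.append(5)              # add on the right end
d.appendleft(1)          # add on the left end

smallest = d.popleft()   # removes and returns 1
largest = d.pop()        # removes and returns 5
remaining = len(d)       # 3 items are left: 2, 3, 4

d.append(0)              # 0 simply lands on the right end; no sorting happens
unsorted_tail = d[-1]    # 0, even though it is the smallest value
```

The last two lines show the gap: keeping the contents sorted is left entirely to the caller.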

Implementing an inefficient solution using lists would be trivial, but finding a class that performs well would be far more desirable. In a growing memory simulation with no upper limit, such a class could keep indexes of empty (deleted) cells and keep fragmentation levels down. The bisect module may help:

  1. Helps keep an array in sorted order while inserting new objects into the array.
  2. Ready made solution for keeping lists sorted as objects are added.
  3. Would allow executing array[-1] to peek at last value in the array.
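A rough sketch of the bisect approach on a plain list follows. The name `free_cells` and the sample indexes are assumptions made for illustration only. Finding the position is O(log n), but the list insertion itself and `pop(0)` are both O(n):

```python
import bisect

free_cells = []                        # hypothetical: indexes of empty cells
for index in (7, 2, 9, 4):
    bisect.insort(free_cells, index)   # binary search, then an O(n) list insert

peeked = free_cells[-1]                # largest value, O(1)
smallest = free_cells.pop(0)           # smallest value, O(n) on a list
largest = free_cells.pop()             # largest value, O(1)
size = len(free_cells)                 # O(1)
```

This covers sorted insertion, peeking, and length, but popping the smallest value from a list shifts every remaining element.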

The final candidate that failed to fully satisfy the requirements and appeared least promising was the heapq module. While supporting what looked like efficient insertions and assuring that array[0] was the smallest value, the array is not always in a fully sorted state. Nothing else was found to be as helpful.
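The heapq behavior described above can be seen in a short sketch (sample values are arbitrary). The smallest item is always at index 0, but the list as a whole is only heap-ordered, and there is no cheap way to pop the largest item:

```python
import heapq

heap = []
for value in (5, 1, 4, 2, 3):
    heapq.heappush(heap, value)        # O(log n) per insertion

smallest = heap[0]                     # the minimum is always at index 0
is_sorted = heap == sorted(heap)       # False here: only the heap invariant holds

# Popping repeatedly does yield the values in ascending order.
drained = [heapq.heappop(heap) for _ in range(len(heap))]
```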


Does anyone know of a class or data structure in Python that comes close to these six requirements?

Noctis Skytower
  • 5
    Which 6 requirements? "Efficient append on *either side* " and "keep an array in sorted order while inserting" are contradictory. – kennytm Nov 04 '10 at 15:23
  • That is why I am looking for **something that comes close** to the requirements. – Noctis Skytower Nov 04 '10 at 15:24
  • 1
    I'm assuming the O(1) performance for the appends is only if it's the min/max? – user470379 Nov 04 '10 at 15:24
  • There is no requirement for appends, only sorted insertions. I'm thinking that a type of sorted binary tree is required for the solution. – Noctis Skytower Nov 04 '10 at 15:25
  • 1
    "1. Efficient append... on either side... with O(1) performance." is not a requirement? – user470379 Nov 04 '10 at 15:27
  • No, only the bold terms are requirements. I was merely stating what `collections.deque` is capable of doing. That is why **pops on both sides** is mentioned directly thereafter. – Noctis Skytower Nov 04 '10 at 15:30
  • 1
    The bold text `O(1) performance` is not a meaningful requirement. O(1) for what? Please add a clear, succinct list of requirements that we don't have to squint through unrelated text and guess. Google doesn't know what a "fragmentation table" is, and neither do I. – Glenn Maynard Nov 04 '10 at 15:34
  • You don't need to know what a fragmentation table is. However, I would use the data structure described above for my own purpose of a *memory fragmentation table*. What that means is specific to my purpose and was provided for the purpose of being sufficient for the curious. People do not need to be asking, "What are you trying to do?" – Noctis Skytower Nov 04 '10 at 15:38
  • 1
    I didn't ask what you're using it for and I don't care. Is that supposed to be a justification for refusing to clarify the list of requirements, or did you simply ignore the first 80% of what I said? – Glenn Maynard Nov 04 '10 at 15:42
  • Please forgive me for ignoring the first 80% of what you said. That was wrong of me. I thought that readers would easily associate `O(1) performance` as being a requirement for as many operations as possible (specifically insertion, popping, peeking, and length lookup). I have become increasingly annoyed at people on SO who ask why someone wants to do something rather than just answering the question (if possible without needing to know the reason). I was trying to be courteous to the curious by stating what the structure is to be used for but did not want to offer any more explanation of the reasons. – Noctis Skytower Nov 04 '10 at 15:50
  • 1
    Very often "what are you trying to do" is a short, polite way of saying "Your question leads me to suspect you are going about your problem in the wrong way; please describe the original problem rather than the corner you've painted yourself into." – zwol Nov 04 '10 at 16:48

3 Answers

12

Your requirements seem to be:

  1. O(1) pop from each end
  2. Efficient len
  3. Sorted order
  4. Peek at last value

for which you can use a deque with a custom insert method which rotates the deque, appends to one end, and unrotates.

>>> from collections import deque
>>> import bisect
>>> class FunkyDeque(deque):
...     def _insert(self, index, value):
...         # Bring position `index` to the left end, add the value, rotate back.
...         self.rotate(-index)
...         self.appendleft(value)
...         self.rotate(index)
...
...     def insert(self, value):
...         # bisect_left finds the position that keeps the deque sorted.
...         self._insert(bisect.bisect_left(self, value), value)
...
...     def __init__(self, iterable):
...         super(FunkyDeque, self).__init__(sorted(iterable))
...
>>> foo = FunkyDeque([3,2,1])
>>> foo
deque([1, 2, 3])
>>> foo.insert(2.5)
>>> foo
deque([1, 2, 2.5, 3])

Notice that requirements 1, 2, and 4 all follow directly from the fact that the underlying data structure is a deque, and requirement 3 holds because of the way data is inserted. (Note of course that you could bypass the sorting requirement by calling e.g. _insert, but that's beside the point.)
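To make the rotate, append, unrotate step easier to follow, here is a small trace on a plain deque. The `steps` list is only there to record intermediate states for illustration and is not part of the class above:

```python
from collections import deque
import bisect

d = deque([1, 2, 3])
value = 2.5
index = bisect.bisect_left(d, value)   # 2: the slot between 2 and 3

steps = []
d.rotate(-index)                       # rotate left so the slot is at the left end
steps.append(list(d))                  # [3, 1, 2]
d.appendleft(value)                    # drop the new value into the slot
steps.append(list(d))                  # [2.5, 3, 1, 2]
d.rotate(index)                        # rotate right to restore the order
steps.append(list(d))                  # [1, 2, 2.5, 3]
```

Each rotation moves up to `index` elements, so the insertion cost grows with the position of the new value even though the pops at either end stay O(1).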

Katriel
  • Well, he also requires O(1) (or close to O(1)) for in-order inserting. deque with all the rotating makes the operation O(N), N being the deque length. – tzot Dec 04 '10 at 23:50
9

Many thanks go out to katrielalex for providing the inspiration that led to the following Python class:

import collections
import bisect

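class FastTable:
    """Sorted container backed by a deque (smallest on the left)."""

    def __init__(self):
        self.__deque = collections.deque()

    def __len__(self):
        return len(self.__deque)

    def head(self):
        # Remove and return the smallest value.
        return self.__deque.popleft()

    def tail(self):
        # Remove and return the largest value.
        return self.__deque.pop()

    def peek(self):
        # Return the largest value without removing it.
        return self.__deque[-1]

    def insert(self, obj):
        # Find the sorted position, then rotate so the new item can be
        # added at the left end before rotating back.
        index = bisect.bisect_left(self.__deque, obj)
        self.__deque.rotate(-index)
        self.__deque.appendleft(obj)
        self.__deque.rotate(index)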
Noctis Skytower
  • 1
    If you could - could you have a look at the questions Q3, Q4 of my post [http://stackoverflow.com/questions/4295806/few-questions-on-generator-expressions-and-speed-efficient-alternatives] and tell me if using FastTable is an answer to Q3, Q4? – PoorLuzer Nov 28 '10 at 22:07
  • For peek(), why not use `queue[0]` or `queue[-1]` as suggested in the docs? Indexed access is O(1) at both ends (but slows to O(n) in the middle) - https://docs.python.org/2/library/collections.html#collections.deque. – mchen Jun 17 '15 at 23:12
  • @MiloChen That is a good question! Thanks for asking. Check the revised version to see if you like it any better. – Noctis Skytower Jun 18 '15 at 14:38
2

blist.sortedlist

  1. Close to O(1) performance for as many of the following four operations as possible.
  2. Maintaining sorted order while inserting an object into the container.
  3. Ability to peek at last value (the largest value) contained in the object.
  4. Allowing for pops on both sides (getting the smallest or largest values).
  5. Capability of getting the total size or number of objects being stored.
  6. Being a ready made solution like the code in Python's standard library.

It's a B+ Tree.

tzot