You may have read this already, but you could use the approach the the heapq
docs suggest, which is to just mark the element as removed, without actually removing it from the heap:
The remaining challenges revolve around finding a pending task and
making changes to its priority or removing it entirely. Finding a task
can be done with a dictionary pointing to an entry in the queue.
Removing the entry or changing its priority is more difficult because
it would break the heap structure invariants. So, a possible solution
is to mark the existing entry as removed and add a new entry with the
revised priority:
pq = [] # list of entries arranged in a heap
entry_finder = {} # mapping of tasks to entries
REMOVED = '<removed-task>' # placeholder for a removed task
counter = itertools.count() # unique sequence count
def add_task(task, priority=0):
'Add a new task or update the priority of an existing task'
if task in entry_finder:
remove_task(task)
count = next(counter)
entry = [priority, count, task]
entry_finder[task] = entry
heappush(pq, entry)
def remove_task(task):
'Mark an existing task as REMOVED. Raise KeyError if not found.'
entry = entry_finder.pop(task)
entry[-1] = REMOVED
def pop_task():
'Remove and return the lowest priority task. Raise KeyError if empty.'
while pq:
priority, count, task = heappop(pq)
if task is not REMOVED:
del entry_finder[task]
return task
raise KeyError('pop from an empty priority queue')
This way the removal is just a O(1)
lookup in a dict. And the removed item will just be ignored when it's popped from the queue later (at a cost of an extra O(log n)
on the pop_task
operation). The drawback of course, is that if a client is not actually popping items from the queue, the size of the heap will grow, even though it items are being "removed" according to the API.