2

I have a working code (with the required efficiency) for a binary min heap priority queue that keeps index 0 empty and has the root node at index 1 (children are 2i and 2i+1). I insert the new value into the heap at the first available empty space on the end of the heap, and apply heapify-up (if needed) to move it up to its respective position. This is the technique that the textbook states to do.

However, in my assignment, the prof wants us to insert the new value into index 0 and then apply heapify-down to move it to its respective position but the root index is still supposed to be at index 1.

The heapify-down method's code will also be used when an item is deleted from index 1 (extractMin) and items must be shifted around to maintain the priority queue and binary min heap structure. In this case, there is no item at index 0...all values are at index >= 1.

Is it possible to maintain the efficiency of O(log n) for the heapify-down method to be used in these cases? Should I create 2 heapify down methods? Currently my method has about 30 lines of code and in trying to fix it to meet the assignment requirements my code is getting to be almost 100+ lines of code with tons of if/else statements.

Thanks for your help!!

Update: spoke with the prof and he told me that he just wanted us to use all indices in the heap so there's no wasted space. I should disregard his previous email and assignment details about inserting at index 0 + heapify-down, and instead just use the index 0 for the swap method. Currently I am using a temp variable so the swap is done in 3 steps but I am modifying this as per his instructions to utilize the space at arr[0] - it's now 4 steps but it meets his requirements :-)

  • 1
    I'm trying to think how this would go, and now my head hurts. Either this is a typo or some convoluted attempt to keep students from copying code from the Internet that doesn't make any sense. I've implemented binary heaps with 0- and 1-based indexing, sometimes with a -infinity sentinel at 0 for 1-based, and I've read many other people's binary heap implementations and never seen anything like this. – David Eisenstat Oct 09 '20 at 18:24
  • Thank you @DavidEisenstat ...I'd been wracking my brain all week on this and thought I was going crazy. I'm talking with the prof today about it but he's so far insisted that this is the way it should go (i.e. insert at 0, heapify down, and keep root index at 1). – python_girl Oct 09 '20 at 18:45
  • Are you permitted to use the "traditional" approach for `extractMin`, i.e., move the last element to the top and percolate down? – pjs Oct 09 '20 at 19:23
  • @pjs yes, that is a separate method that we have to implement to confirm that the priority queue works as expected. – python_girl Oct 09 '20 at 20:52
  • I'm not sure you answered my question. I understand that `extractMin` can be used to confirm the priority queue works, but are you allowed to perform the initial direct access of the n'th element? – pjs Oct 09 '20 at 20:58
  • @pjs I guess so....there's nothing in the assignment doc that says we can't. All we have is a description of the ExtractMin() method and that when applied it can be used to confirm that our PQ works. We also have to use it as part of the delete methods. – python_girl Oct 09 '20 at 21:39

1 Answers1

1

Taking you at your word that extractMin() can be implemented in the traditional fashion, I'm going to focus on how to implement a top-down add(elt) method.

A key aspect of guaranteeing O(log n) performance is to have the heap array be compact, so that a heap with n elements using index 1 as the root always has those elements stored in locations 1 through n inclusive. Normally that's achieved by adding a new element at index n + 1 and then percolating it up until the heap property is restored. To do this top-down, we want to consider transiting the same path, but from top to bottom rather than bottom to top. Either way, the set of vertices is the same and can be determined by successive halving of the destination vertex index n. Taking the name you chose as tacit permission to use Python, the following simple function generates the set of indices given argument n:

def index_path(n):
    bits = n.bit_length()
    return [n >> (bits - i) for i in range(1, bits)]

For instance, running index_path(42) yields [1, 2, 5, 10, 21], which you can easily confirm is the same set of indices evaluated in the bottom-up approach. As long as we stick with this path through the heap, we will maintain compactness.

The algorithm is then

  • Place the new element at index 0 of the array
  • Update n, the number of elements in the heap
  • Generate the set of indices leading from the top (1) up to but not including the last index (n)
  • Iterate through the index set
    • if the current iterate's value is ≤ the 0th value, advance to the next;
    • otherwise, swap the 0th value with the current iterate's value
  • Append or insert (as appropriate) the 0th value at index n

A quick implementation in Python:

class MyHeap:
    def __init__(self):
        self.ary = [None]
        self.size = 0

    def add(self, elt):
        self.ary[0] = elt
        self.size += 1
        for index in index_path(self.size):
            if self.ary[0] < self.ary[index]:
                self.ary[0], self.ary[index] = self.ary[index], self.ary[0]
        if len(self.ary) > self.size:
            self.ary[self.size] = self.ary[0]
        else:
            self.ary.append(self.ary[0])
        self.inspect()

    def inspect(self):
        print(self.ary, self.size)

Here's a test run:

test_data = [42, 42, 96, 17, 1, 3, 5, 29, 12, 39]
test_heap = MyHeap()
test_heap.inspect()
for x in test_data:   
    test_heap.add(x)

which produces:

[None] 0
[42, 42] 1
[42, 42, 42] 2
[96, 42, 42, 96] 3
[42, 17, 42, 96, 42] 4
[42, 1, 17, 96, 42, 42] 5
[96, 1, 17, 3, 42, 42, 96] 6
[5, 1, 17, 3, 42, 42, 96, 5] 7
[42, 1, 17, 3, 29, 42, 96, 5, 42] 8
[29, 1, 12, 3, 17, 42, 96, 5, 42, 29] 9
[42, 1, 12, 3, 17, 39, 96, 5, 42, 29, 42] 10

which can easily be confirmed to be a valid heap at each stage.

Unlike the bottom-up approach, this version can't short-circuit. It must transit the entire path every time.

pjs
  • 18,696
  • 4
  • 27
  • 56
  • Wow! Thank you for your explanation. For my assignment, I'm able to keep my existing code with the bottom-up approach but I'm definitely going to make a note of this for my knowledge. – python_girl Oct 10 '20 at 02:45
  • 1
    Glad to help. As a side note you might be potentially interested in, for any index `n` the bits represent a path for how to get to node `n` in the tree representation of the heap. Starting at node 0 and viewing the actual data locations of the heap as being a right subtree, you can read the "directions" starting from the leftmost bit of `n`. Values of 0 and 1 direct you to go to the left or right child, respectively. As an example, to navigate to node 11, which is 1011 in binary, you would go right, left, right, and right to get there. – pjs Oct 10 '20 at 03:59