Assume that a max heap with 10^6 elements is stored in a complete 7-ary tree. Approximately how many comparisons will a call to removeMin() make?

  • 5000
  • 50
  • 10^6
  • 500
  • 5

My solution: The number of comparisons should be at most the number of leaf nodes, because in a max heap the minimum can sit at any of the leaf nodes, but that count is not among the options above. Another approach was to take the square of (log of 10^6 to the base 7), which gives about 50, but this only works if we are sure that the search for the minimum follows a single branch down the tree, which is not true for a max heap. I hope that you can help.

user7610
Hashir
  • log of 10^6 to the base 7 is about 7, not 50... – Yonlif May 06 '20 at 10:35
  • I think what the question is really about is "What is the time complexity of the operation of removing the min in a max heap?", and the answer to this is `O(n)`, so the answer should be `10^6`. – Yonlif May 06 '20 at 10:41
  • No. Firstly I said square of log 10^6 which is 50. Secondly, I am not asking for time complexity but the total number of comparisons that removeMin() will make. – Hashir May 06 '20 at 10:56
  • Your answer would be correct if you wrote "min heap" with "removeMin" or if you wrote "removeMax" with "max heap"... Although the fact that the "square of the log" matches is a coincidence – Matt Timmermans May 06 '20 at 11:17
  • Exactly my point. But this question was asked in my data structures exam so I am having a hard time to figure out removeMin() with max-heap. – Hashir May 06 '20 at 11:27

2 Answers

There's no "natural" way to remove the minimum value from a max heap. You simply have to look at all the leaf nodes to figure out which one happens to be the minimum.

The question then is how many leaf nodes there are. Intuitively, we'd expect the number of leaves to be a large fraction of the total number of nodes. Take it to the limit - if you have a 1,000,000-ary heap, you'd have one node in the top layer and all remaining 999,999 elements in the next layer. Even in the smallest case, where the heap is a binary heap, you'd expect roughly half the elements to be leaves.

More specifically, let's do some math! How many leaves will a 7-ary heap with n nodes have? Well, each node in the tree will either

  • be a leaf, or
  • have seven children,

with one possible exception that, since the bottommost row might not be full, there might be one node with fewer than seven children. Since that's just a one-off, we can ignore that last node when we're dealing with millions of elements. A quick proof by induction can be used to show that any tree where each node has either no children or seven children, with I internal nodes, has exactly 6I + 1 leaves (prove this!). That is about six times as many leaf nodes as internal nodes, so we'd expect that roughly (6/7)ths of the nodes will be leaves, for a total of about 857,000 leaves to check.

As a result, the best answer here would be roughly 10^6 comparisons.
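
As a sanity check on the leaf count, here is a minimal Python sketch. It assumes the heap is laid out in a 0-indexed array where the parent of index i is (i - 1) // d; the helper name `count_leaves` is made up for illustration and is not part of the question.

```python
def count_leaves(n: int, d: int) -> int:
    """Leaves in a complete d-ary tree with n nodes (0-indexed array layout)."""
    if n <= 1:
        return n
    # The last node sits at index n - 1, so its parent is (n - 2) // d.
    # Every index up to and including that parent is an internal node.
    last_parent = (n - 2) // d
    return n - (last_parent + 1)


if __name__ == "__main__":
    n = 10**6
    for d in (2, 7):
        leaves = count_leaves(n, d)
        print(f"d={d}: {leaves} leaves, fraction {leaves / n:.4f}")
```

For d = 7 and n = 10^6 this gives 857,143 leaves, which is of the same order of magnitude as 10^6 and far from any of the other options.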

templatetypedef
  • I wrote a C++ program to count the leaf nodes, but it could only handle a 7-ary tree of up to 10^5 nodes. I don't remember exactly, but the count was near 87,500, so the difference between the leaf nodes and the **total nodes = 10^5** is around 12,500. A tree with 10^6 nodes should follow the same pattern, so do you think it is still sensible to choose 10^6 when the difference is in the thousands of nodes (comparisons)? – Hashir May 06 '20 at 17:47

The minimum element can be any of the leaves of a max heap of any arity, and the leaves are in no particular order. With 1-based indexing, all elements from A[10^6/7 + 1] onwards (where A is the array storing the heap) are leaf nodes and need to be checked. That is 857,143 leaves, which means 857,142 comparisons just to find the minimum. After that there is no simple way to 'remove' the minimum without introducing a gap that cannot be filled by simply shifting the leaves.
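
The leaf scan described above can be sketched in Python as follows. This is only an illustration under stated assumptions: a 0-indexed array (so the first leaf is at index (n - 2) // d + 1 rather than the 1-based position used above), a non-empty heap, and hypothetical helper names (`sift_down`, `build_max_heap`, `find_min_in_max_heap`). It uses a smaller heap than 10^6 so that it runs quickly.

```python
import random


def sift_down(a, i, d):
    """Move a[i] down until the max-heap property holds below it."""
    n = len(a)
    while True:
        first = d * i + 1                      # index of the first child
        if first >= n:
            return
        # Pick the largest among the (up to d) children.
        largest = max(range(first, min(first + d, n)), key=a.__getitem__)
        if a[largest] <= a[i]:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a, d):
    """Turn list a into a d-ary max heap in place."""
    for i in range((len(a) - 2) // d, -1, -1):
        sift_down(a, i, d)


def find_min_in_max_heap(a, d):
    """Scan only the leaves; return (index of minimum, comparisons used)."""
    n = len(a)
    first_leaf = (n - 2) // d + 1              # nodes before this have children
    best, comparisons = first_leaf, 0
    for i in range(first_leaf + 1, n):
        comparisons += 1
        if a[i] < a[best]:
            best = i
    return best, comparisons


if __name__ == "__main__":
    heap = list(range(10_000))
    random.shuffle(heap)
    build_max_heap(heap, 7)
    idx, comps = find_min_in_max_heap(heap, 7)
    print(heap[idx] == min(heap), comps)
```

The comparison count is always the number of leaves minus one, so it scales linearly with the heap size.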

This is a misprint. Maybe the teacher wanted to ask removeMax, for which the answer is close to 50 -- see below:

There are 7 comparisons per level done by the heapify since each node has 7 children. If h is the height of the heap then that's 7*h comparisons.

Rough analysis: (here ~ means approximately) h ~ log_7(10^6) = 7.1, hence total comparisons 7*7.1 ~ 50

More accurate analysis: A 7-ary heap with h full levels would have 1 + 7 + 7^2 + ... + 7^(h-1) = 10^6 elements.

On the left side is a geometric series, which sums to: (7^h - 1)/6 = 10^6

=> 7^h = 6*10^6 + 1 => h = log_7(6*10^6 + 1) = 8 (approximately), i.e. eight full levels plus a partly filled ninth, so the longest root-to-leaf path has 8 edges. Hence 7*8 = 56; from the options, 50 is still the closest.
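
The arithmetic for the removeMax case can be checked with a short Python sketch. It assumes a 0-indexed array layout and counts 7 comparisons per level as above; the function name `height` is just an illustrative choice.

```python
import math


def height(n: int, d: int) -> int:
    """Edges on the longest root-to-leaf path of a complete d-ary tree."""
    h, node = 0, n - 1              # start at the last node (0-indexed)
    while node > 0:
        node = (node - 1) // d      # step up to the parent
        h += 1
    return h


if __name__ == "__main__":
    n, d = 10**6, 7
    rough = d * math.log(n, d)      # 7 * log_7(10^6)
    worst = d * height(n, d)        # 7 comparisons on each level walked
    print(f"rough estimate: {rough:.1f}, worst case: {worst}")
```

Both figures, about 49.7 and 56, are closest to the option 50.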

*A is the array used to store the heap.

Hashir