
I've been playing with Binomial Heaps, and I ran into a problem I'd like to discuss here, because I think it may concern anyone implementing a Binomial Heap data structure.

My aim is to get back a pointer (or handle) whenever I insert a value into a Binomial Heap. The handle points directly to the heap's internal node, which in turn contains my priority value, usually an integer. This way I keep a pointer/handle to what I have inserted and, if needed, I can delete the value directly using binomial_heap_delete_node(node), just like an iterator works.

As we'll see, this is NOT possible with Binomial Heaps, because of the architecture of this data structure.

The main problem with Binomial Heaps is that at some point you'll need an operation binomial_heap_swap_parent_and_child(parent, child), and you'll need it in both binomial_heap_decrease_key(node, key) and binomial_heap_delete_node(node). The purpose of these operations is clear from their names.

So the problem is how binomial_heap_swap_parent_and_child(parent, child) works: in all the implementations I have seen, it swaps the priority values between the nodes, NOT the nodes themselves. This invalidates all of your pointers/handles/iterators to nodes: they still point to valid nodes, but those nodes no longer hold the priority values you inserted.
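To make the failure concrete, here is a minimal sketch of the value-swapping approach. It is not taken from any particular library: struct bh_node, swap_keys and decrease_key are hypothetical names, and the node is stripped down to a key and a parent pointer (a real binomial heap node also carries degree, child and sibling links).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down node: only what the bubble-up step needs. */
struct bh_node {
    int key;
    struct bh_node *parent;
};

/* Value-swapping variant: the nodes stay where they are, only keys move. */
static void swap_keys(struct bh_node *parent, struct bh_node *child)
{
    int tmp = parent->key;
    parent->key = child->key;
    child->key = tmp;
}

/* Lower node's key, then bubble the key up while it is smaller than the
 * parent's key. Returns the node that ends up holding new_key, which is
 * not necessarily the node the caller passed in. */
static struct bh_node *decrease_key(struct bh_node *node, int new_key)
{
    assert(new_key <= node->key);
    node->key = new_key;
    while (node->parent != NULL && node->key < node->parent->key) {
        swap_keys(node->parent, node);
        node = node->parent;
    }
    return node;
}
```

With a root holding 10 and a child holding 20, calling decrease_key on the child with 5 leaves the root holding 5 and the child holding 10. A caller who kept a pointer to the child as "the handle for 20, now 5" is now looking at the root's old value.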

This is quite logical if we look at how Binomial Heaps (or Binomial Trees in general) are structured: a parent node is treated as "the parent" by many children, so many children point to it, but that parent node doesn't know how many children (or, more importantly, which children) point to it, so it is impossible to swap the position of a node like this. Your only choice is to swap the integer priority keys, but that invalidates all pointers/handles/iterators to nodes.

NOTE: Replacing binomial_heap_delete_node(node) with "set the priority of the node to -999999999 (or some similar minimum value) and pop the minimum" is NOT a solution. binomial_heap_decrease_key(node, key) still needs the parent-child swap operation, and the only way to implement that swap is to exchange the integer priorities.

I'd like to know whether anyone has run into this problem before. I think the only solution is to use another heap structure, such as a Binary Heap, a Pairing Heap, or something else.

Marco Pagliaricci

1 Answer


As with many data structure problems, it's not hard to solve this one with an extra level of indirection. Define something like:

struct handle {
  struct heap_node *node;  // points to node that "owns" this handle
  struct user_data *data;  // priority key calculated from this
};

You provide users with handles (either by copy or pointer/reference; your choice).

Each internal node points to exactly one handle (i.e. node->handle->node == node). Two such pointers are swapped to exchange parent and child.

Several variations are possible. E.g. the data field could be the data itself rather than a pointer to it. The main idea is that adding the level of indirection between handles and nodes provides the necessary flexibility.
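The handle scheme described above can be sketched roughly as follows. This is an illustration under assumptions, not a complete binomial heap: the key is stored directly in the handle as an int (one of the variations just mentioned), heap_node is reduced to a handle pointer and a parent pointer, and attach, swap_parent_and_child and decrease_key are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

struct heap_node;

struct handle {
    struct heap_node *node;  /* node that currently owns this handle */
    int key;                 /* priority, stored inline for brevity */
};

/* Hypothetical minimal node; degree/child/sibling links omitted. */
struct heap_node {
    struct handle *handle;
    struct heap_node *parent;
};

/* Link a node and a handle both ways, keeping n->handle->node == n. */
static void attach(struct heap_node *n, struct handle *h)
{
    n->handle = h;
    h->node = n;
}

/* Exchange parent and child by swapping their handle pointers.
 * The tree links are untouched; only the two back-pointers are fixed. */
static void swap_parent_and_child(struct heap_node *parent,
                                  struct heap_node *child)
{
    struct handle *ph = parent->handle;
    struct handle *ch = child->handle;
    attach(parent, ch);
    attach(child, ph);
}

/* Decrease the key behind a user-held handle and bubble it up. */
static void decrease_key(struct handle *h, int new_key)
{
    assert(new_key <= h->key);
    h->key = new_key;
    struct heap_node *n = h->node;
    while (n->parent != NULL && n->handle->key < n->parent->handle->key) {
        swap_parent_and_child(n->parent, n);
        n = n->parent;
    }
}
```

The user only ever holds struct handle pointers. After any number of swaps a handle still carries the key it was created with, and handle->node says where that key currently lives in the tree, which is what a delete operation needs.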

Gene
  • Yeah, your solutions are pretty obvious, just like adding to each node a vector of pointers to the children that have that node as their parent. But whether I add more data per node or more levels of indirection, I kill performance. So for now, the solution is still to use another data structure which is NOT a binomial heap. – Marco Pagliaricci Nov 28 '15 at 21:10
  • @MarcoPagliaricci I guess it was so obvious you didn't mention it ;-). But you'll run into similar problems with other heaps. E.g. a simple binary heap maintained in the usual way as a complete binary tree in an array: where the basic implementation stores data in the array, to get delete and decrease key, you must store pointers (or array indices) in the array, referring to data elsewhere. And you must maintain a reverse map from data back into the complete tree array. Here's an implementation of mine (for a special purpose): https://github.com/gene-ressler/lulu/blob/master/ext/lulu/pq.c – Gene Nov 28 '15 at 23:29
  • @MarcoPagliaricci NB: Another point is that I'm not sure what you mean by "kill performance." With careful implementation, you're adding one pointer per node. It's a rare application indeed where 4 or 8 bytes per node and a single pointer dereference per access are going to cause a significant difference. That's exactly the overhead introduced by C++ object references, e.g. – Gene Nov 28 '15 at 23:37
  • Your code is very interesting, thanks for sharing. Seriously, I was so focused on the algorithmic side that I dismissed all other solutions, like adding more levels of indirection, as trivial. Btw, you made a good point here: other heap structures suffer from this limitation too, such as the binary heap implemented with a vector, as you mentioned. But not the binary heap implemented as a real binary tree, with nodes: in that case I can get iterators working properly, since I don't swap priority values. I think we can consider this quest over :-) – Marco Pagliaricci Nov 29 '15 at 00:12
  • @MarcoPagliaricci But then you're giving up 2 pointers per node instead of 1! – Gene Nov 29 '15 at 00:25
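
For the array-based binary heap mentioned in the comments, the reverse map can be sketched like this. It is a simplified illustration and not the code from the linked pq.c: items are identified by small integer ids chosen by the caller, the capacity is fixed, and struct pq, place, sift_up, pq_insert and pq_decrease_key are hypothetical names.

```c
#include <assert.h>

#define CAP 16  /* fixed capacity, for illustration only */

struct pq {
    int key[CAP];   /* key[id]: priority of item id; never moves */
    int heap[CAP];  /* heap[p]: id of the item at heap position p */
    int pos[CAP];   /* pos[id]: reverse map, heap position of item id */
    int size;
};

/* Put item id at heap position p and keep the reverse map in sync. */
static void place(struct pq *q, int p, int id)
{
    q->heap[p] = id;
    q->pos[id] = p;
}

/* Move the item at position p up until its parent's key is not larger. */
static void sift_up(struct pq *q, int p)
{
    int id = q->heap[p];
    while (p > 0) {
        int up = (p - 1) / 2;
        if (q->key[q->heap[up]] <= q->key[id])
            break;
        place(q, p, q->heap[up]);  /* pull the parent down one level */
        p = up;
    }
    place(q, p, id);
}

/* Insert item id (assumed not already present) with the given key. */
static void pq_insert(struct pq *q, int id, int key)
{
    assert(q->size < CAP && id >= 0 && id < CAP);
    q->key[id] = key;
    place(q, q->size, id);
    sift_up(q, q->size);
    q->size++;
}

/* The id acts as the stable handle; pos[id] locates it in the heap. */
static void pq_decrease_key(struct pq *q, int id, int new_key)
{
    assert(new_key <= q->key[id]);
    q->key[id] = new_key;
    sift_up(q, q->pos[id]);
}
```

Here the heap array holds ids rather than keys, so sifting moves ids around while key[id] stays put, and pos[] plays the same role as the node pointer inside a handle.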