
In this AVL tree implementation from Solaris, struct avl_node is defined in the obvious way when compiling for a 32-bit library.

But for a 64-bit library, the pointer to a node's parent is packed into "avl_pcb", and it looks like only 61 bits of the pointer are stored.

  1. Why does this work?
  2. Why not do something similar for 32-bit?
templatetypedef

1 Answer


On a 64-bit machine, objects such as tree nodes are usually aligned on word boundaries, which are multiples of eight bytes. As a result, the lowest three bits of a pointer to such an object are always zero. Consequently, if a data structure needs three bits of information, it can pack them into the lowest three bits of a pointer. That way:

  • To follow the pointer, clear the lowest three bits of the stored value, then dereference the result.
  • To read any of the three bits, mask off the rest of the bits in the stored value and read the remaining bits directly.
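
The two steps above can be sketched in C roughly as follows. This is a minimal illustration, not the Solaris code: the names `struct node`, `pcb`, `TAG_MASK`, and the `node_*` helpers are hypothetical, and the bit layout (balance + 1 in bits 0-1, child index in bit 2) is an assumption chosen for the example. It also assumes nodes are at least 8-byte aligned, which the assertion checks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node layout for illustration; not the Solaris definition. */
struct node {
    struct node *child[2];  /* left and right children, stored untagged */
    uintptr_t pcb;          /* parent pointer | child index << 2 | balance + 1 */
};

#define TAG_MASK ((uintptr_t)7)  /* the three low bits freed by 8-byte alignment */

/* Pack the parent pointer together with three bits of bookkeeping. */
static void node_set_pcb(struct node *n, struct node *parent,
                         unsigned child_index, int balance)
{
    uintptr_t p = (uintptr_t)parent;

    assert((p & TAG_MASK) == 0);  /* assumption: parent is 8-byte aligned */
    assert(child_index <= 1 && balance >= -1 && balance <= 1);
    n->pcb = p | ((uintptr_t)child_index << 2) | (uintptr_t)(balance + 1);
}

/* Follow the pointer: clear the low three bits first. */
static struct node *node_parent(const struct node *n)
{
    return (struct node *)(n->pcb & ~TAG_MASK);
}

/* Read the packed bits: mask off everything else. */
static unsigned node_child_index(const struct node *n)
{
    return (unsigned)((n->pcb >> 2) & 1);
}

static int node_balance(const struct node *n)
{
    return (int)(n->pcb & 3) - 1;  /* stored as 0..2, meaning -1..+1 */
}
```

Only the parent field carries tag bits here, so walking down through `child[]` needs no masking at all.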

This approach is pretty standard and doesn't lose any ability to point to addresses, since usually for performance or hardware reasons you wouldn't want to have non-aligned pointers anyway.

What I'm not sure about is why they didn't do this in the 32-bit case. There, pointers are typically aligned to four bytes, so only the lowest two bits of each pointer are free, but with three pointers per node they could easily hide the necessary bits using the same trick with two bits per pointer. My guess is that it's a performance thing: if you pack bits into the bottom of a pointer, you increase the cost of following that pointer because of the computation needed to clear the bits. Note, for instance, that in the 64-bit case the bits are packed into the parent pointer, which is only used for less common operations like computing inorder successors or doing rotations on an insert or delete. This keeps lookups fast. In the 32-bit case, to hide 3 bits, the implementation would need to use the lower bits of two pointers, one of which would have to be the left or right pointer. My guess is that the performance hit of slowing down tree searches wasn't worth the space savings, so they decided to just take the memory hit and store the bits separately. This is just speculation, though, since they absolutely could have stored the bits in the bottoms of their pointers if they had wanted to.

On a side note: if the implementation were using a red/black tree rather than an AVL tree, then only two bits of information would be necessary: a bit to tell whether the node is red or black, and a bit to tell whether the node is a left or right child. In that case, the two bits required could always be packed into a single 4-byte-aligned 32-bit pointer. This is one reason why red/black trees are popular.
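
As a rough sketch of that red/black variant: with only 4-byte alignment, two low bits are free, which is enough for a colour bit and a left/right bit. The names `struct rb_node`, `parent_bits`, `RB_TAG_MASK`, and the `rb_*` helpers are hypothetical, and the bit assignment is an assumption made for this example rather than taken from any particular library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical red/black node; names and layout are illustrative only. */
struct rb_node {
    struct rb_node *child[2];  /* left and right children, stored untagged */
    uintptr_t parent_bits;     /* parent pointer | is_right << 1 | is_red */
};

#define RB_TAG_MASK ((uintptr_t)3)  /* two low bits freed by 4-byte alignment */

/* Pack the parent pointer with the colour and left/right bits. */
static void rb_set_parent_bits(struct rb_node *n, struct rb_node *parent,
                               int is_right, int is_red)
{
    uintptr_t p = (uintptr_t)parent;

    assert((p & RB_TAG_MASK) == 0);  /* assumption: at least 4-byte aligned */
    n->parent_bits = p | ((uintptr_t)(is_right != 0) << 1)
                       | (uintptr_t)(is_red != 0);
}

/* Clear both tag bits before using the value as a pointer. */
static struct rb_node *rb_parent(const struct rb_node *n)
{
    return (struct rb_node *)(n->parent_bits & ~RB_TAG_MASK);
}

static int rb_is_right_child(const struct rb_node *n)
{
    return (int)((n->parent_bits >> 1) & 1);
}

static int rb_is_red(const struct rb_node *n)
{
    return (int)(n->parent_bits & 1);
}
```

As in the 64-bit AVL case, the child pointers stay untagged, so searches pay no masking cost.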

Hope this helps!

templatetypedef
  • Note that the source file is in usr/src/uts/... which is the kernel source tree, not userspace libraries, so there may simply not have been a need to optimize the 32-bit implementation in the use cases they had when it was designed. – alanc Jan 12 '13 at 19:47
  • sys/avl.h is heavily used in userspace too, e.g. by the runtime linker and some other libraries. –  Jan 14 '13 at 07:29
  • But how much memory is required per node for AVL? It seems to be 2 bits, too. – 4esn0k Feb 25 '13 at 04:22