5

I'm wondering whether there is a mapping between a sorted array (e.g., [1, 2, 3, 4, 5, 6]) and the representation that one obtains when one constructs a complete binary search tree from this sorted array, and expresses said binary search tree as an array (e.g., [4, 2, 6, 1, 3, 5], see graphic below)?

     4
  2     6
1   3  5

Here's some more context: It is well known that one can take a sorted array and construct a complete binary search tree from it (there is a unique representation). A recursive algorithm is: find the appropriate mid (this is actually quite tricky), treat it as the root, then recurse on left subarray and right subarray. From the resulting BST, one can perform a level-order traversal (basically breadth first search) to construct the array representation of the complete BST.

The reason I ask this is that this mapping is independent of the content of the array: it depends only on its length. Therefore I get the feeling that it should be possible to concisely express both arrays as a function of each other.

Any thoughts?

sga001
  • 1
    Go with an implicit mapping that uses divide and conquer: map element number round(size/2) to the root, then apply the implicit function to the left side of the array for the left child and to the right side for the right child. – Aziuth Apr 05 '16 at 07:46
  • 1
    @Aziuth that won't result in a complete BST, only a balanced one. Complete BSTs are a subclass of balanced BSTs, not an equivalent class. –  Apr 05 '16 at 09:11

4 Answers

4

The height of the tree is predictable: roundUp(log2(nodes + 1)) levels. We also know that the right subtree is never larger than the left subtree: |LS| >= |RS|. Furthermore, we can calculate the number of nodes that are missing to make the tree perfect: 2 ^ height - 1 - arr.length. This allows us to predict how to distribute nodes among subtrees:

int findRoot(int[] arr , int maxLeaves , int maxLevelL)
    //maxLeaves is the number of leaves on the maximum-level
    //maxLevelL is the number of nodes the maximum-level can hold
    int l = min(maxLevelL / 2 , maxLeaves)
    return (arr.length - maxLeaves) / 2 + l

node buildTree(int[] arr , int maxLeaves , int maxLevelL)
    if arr.length == 0
        return null

    node result
    int rootidx = findRoot(arr , maxLeaves , maxLevelL)
    //number of maximum-level leaves that belong to the left subtree
    int l = min(maxLevelL / 2 , maxLeaves)

    result.val = arr[rootidx]

    result.left = buildTree(arr.subarray(0 , rootidx) , l , maxLevelL / 2)
    result.right = buildTree(arr.subarray(rootidx + 1 , arr.length) , maxLeaves - l , maxLevelL / 2)

    return result

The basic idea is the following: all complete BSTs share one property that follows from the recursive definition of a BST: (LS , R , RS) OR null, where LS and RS are the left and right subtrees, which are themselves BSTs. Both LS and RS are complete, and at least one of them must be perfect. We can easily predict which of the two is perfect: the deepest level has room for m nodes, but the array is x nodes short of a perfect tree, so m - x nodes actually sit on that level. Thus:

if m - x == m / 2 then both are perfect and the height of RS is height(LS) - 1
if m - x < m / 2 RS is perfect, LS only complete but not perfect
if m / 2 < m - x < m LS is perfect, RS only complete but not perfect
if x == 0 both LS and RS are perfect and of equal height

We can find the root of a tree using the following rule: calculate the number of nodes in the left (l) and right (r) subtree that would be placed on the deepest level. We can then set those nodes aside, calculate the root of the remaining perfect BST, and add the left and right nodes back implicitly: root = (arr.length - (l + r)) / 2 + l (using integer division)

E.g.:
Input:   1  2  3  4  5 
Nodes on maxLevel: 2
maxLevelL: 4

l = 2
r = 0

root_idx = (arr.length - (l + r)) / 2 + l =
     = (5 - 2) / 2 + 2 = 
     = 3

Apply this algorithm recursively to define subtrees:
...

result:
                  4
                /   \
               2     5
             /   \
            1     3

NOTE: I haven't tested this code, so it may still contain a few arithmetic errors that need fixing. The logic is correct, though. It is only meant to show a way of remapping the indices from one array to the other; an actual implementation might look quite different from the code I provided.

After having this discussion for the second time, here's a definition of a complete BST:

In a complete binary tree every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible.

(from Wikipedia)

Complete BSTs are a subclass of balanced BSTs, with a few additional constraints, that allow a unique mapping of a complete BST to a sorted array and vice versa. Since complete BSTs are only a subclass of balanced BSTs, it won't suffice to build a balanced BST.

EDIT:
The above algorithm can be altered in the following way to directly build the array:

  • the root of the tree has index 0
  • the left child of the node with index n has index (n + 1) * 2 - 1
  • the right child of the node with index n has index (n + 1) * 2

Usually these access operations are written for a 1-based array, but I've adjusted them for a 0-based array for convenience.

Thus we can reimplement buildTree to directly produce an array:

void buildTree(int[] arr , int maxLeaves , int maxLevelL , 
          int[] result , int nodeidx)
    if arr.length == 0
        return

    int rootidx = findRoot(arr , maxLeaves , maxLevelL)
    //number of maximum-level leaves that belong to the left subtree
    int l = min(maxLevelL / 2 , maxLeaves)

    //insert value into correct position of result-array
    result[nodeidx] = arr[rootidx]

    //build left subtree
    buildTree(arr.subarray(0 , rootidx) , l , maxLevelL / 2 , 
              result , (nodeidx + 1) * 2 - 1)

    //build right subtree
    buildTree(arr.subarray(rootidx + 1 , arr.length) , maxLeaves - l , maxLevelL / 2 ,
              result , (nodeidx + 1) * 2)

Note that unlike arr, we never use any subarrays of result. The indices of the respective nodes never change, throughout any method-calls.
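
For readers who want something runnable, here is a hedged Python sketch of the same idea. It is not a line-by-line port: instead of passing maxLeaves and maxLevelL down the recursion, it recomputes them from the size of each subarray, and it uses index bounds (lo, hi) in place of subarrays. The name `to_complete_bst_array` is an assumption.

```python
def to_complete_bst_array(sorted_values):
    """Lay out sorted_values as the level-order array of a complete BST."""
    n = len(sorted_values)
    result = [None] * n

    def build(lo, hi, node_idx):
        # Place the root of sorted_values[lo:hi] at result[node_idx].
        size = hi - lo
        if size == 0:
            return
        height = size.bit_length()                 # levels in this subtree
        max_level_cap = 1 << (height - 1)          # capacity of its deepest level
        on_max_level = size - (max_level_cap - 1)  # nodes on that level
        left_size = (max_level_cap - 1) // 2 + min(max_level_cap // 2, on_max_level)
        root = lo + left_size
        result[node_idx] = sorted_values[root]
        build(lo, root, (node_idx + 1) * 2 - 1)    # left child slot
        build(root + 1, hi, (node_idx + 1) * 2)    # right child slot

    build(0, n, 0)
    return result


print(to_complete_bst_array([1, 2, 3, 4, 5, 6]))  # [4, 2, 6, 1, 3, 5]
```

As in the pseudocode, `result` is shared by every call and only the node index changes.
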

  • You seem to be suggesting a way to build a complete BST from the sorted array. My question was slightly different. My intention was to construct a function F, such that F(old_idx, n) -> new_idx. Here old_idx represents the index in the input sorted array, and n is the length of the array. new_idx corresponds to the index of the value in the array representation of the complete BST. It is true that I could simply construct the tree (as you suggest), then do BFS (as I mentioned in the question) but this seems far too wasteful no? (still O(n) but tons of repeated work no?). – sga001 Apr 05 '16 at 18:59
  • Maybe a place to look for inspiration is in the cache oblivious datastructure literature... I'll report back if I can find something. – sga001 Apr 05 '16 at 19:03
  • @sga001 my code's actually only supposed to provide the algo for building a tree from the given array. How the tree is built in the end should be pretty simple to alter and is only implementation-specific. E.g. transforming the above code into a heap-construction that is array-based should be pretty simple. All you need to do is to remap each node to its index in the resulting array. I can add that to the answer, if needed. –  Apr 05 '16 at 19:33
  • Fair point. I added my algo below. I believe it is similar to yours. Thanks. I guess the answer to this question really is there is no obvious closed-form expression to perform this mapping (in fact, there might not be one at all... but my theory background is too weak to know one way or the other). – sga001 Apr 05 '16 at 19:44
  • @sga001 there is. I've edited my post with an algo that directly solves your problem. Just as I said already in the answer, my code is only an example of how the transformation can be done. The actual tree can look quite a lot different, and all that needs to be done is to replace a few element-access with the respective operations. –  Apr 05 '16 at 19:50
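
The function F(old_idx, n) -> new_idx described in the comments above can be evaluated for a single index without building the tree, by walking down from the root and narrowing the index range at each step. The following is a minimal Python sketch assuming 0-based indices; the names `root_offset` and `sorted_to_bst_index` are made up for illustration, and this is an O(log n) descent, not a closed-form expression.

```python
def root_offset(n):
    """Offset of the root within a sorted run of n >= 1 elements."""
    full = (1 << ((n + 1).bit_length() - 1)) - 1  # nodes in the completely filled levels
    last_cap = full + 1                           # capacity of the level below them
    return full // 2 + min(n - full, last_cap // 2)


def sorted_to_bst_index(old_idx, n):
    """Map an index in the sorted array to its index in the complete-BST array."""
    lo, hi, node = 0, n, 0        # current subarray [lo, hi) and its heap slot
    while True:
        root = lo + root_offset(hi - lo)
        if old_idx == root:
            return node
        if old_idx < root:
            hi, node = root, 2 * node + 1      # descend into the left subtree
        else:
            lo, node = root + 1, 2 * node + 2  # descend into the right subtree


print([sorted_to_bst_index(i, 6) for i in range(6)])  # [3, 1, 4, 0, 5, 2]
```

For the question's example, the value 4 at sorted index 3 maps to index 0, the root of [4, 2, 6, 1, 3, 5].
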
0

Here's what I came up with. It is not ideal in that it is not the function I had in mind, but it saves the effort of building the tree and then creating the array from it.

find_idx(n) {
  if n == 1 { return 0; }

  h = ceil(lg(n+1)) // height of the tree
  f_h = floor(lg(n+1)) // height of the full portion (h or h-1)
  m_n = 2^h - 1 // # of nodes if tree were full
  f_n = 2^f_h -1 // # of nodes of full portion

  return floor(f_n / 2) + min(n - f_n, floor((m_n - f_n) / 2))
}

to_bst_array(array) {
   q = new empty queue
   res = resulting vector

   q.push(array)

   while !q.is_empty() {
     subarray = q.pop()
     idx = find_idx(subarray.len())

     res.push(subarray[idx])

     if subarray.len() > 1 {
       q.push(subarray[..idx]) // slice from 0 to idx
     }

     if subarray.len() > idx + 1 {
       q.push(subarray[idx + 1..]) // slice from idx+1 till end of subarray
     }
   }

   return res
}
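
A possible Python rendering of the pseudocode above, offered as a sketch rather than a tested reference implementation. It assumes integer `bit_length` arithmetic in place of `lg` with `ceil`/`floor`, and uses `collections.deque` as the queue.

```python
from collections import deque


def find_idx(n):
    """Index of the root within a sorted run of n >= 1 elements."""
    h = n.bit_length()                # ceil(lg(n + 1)): height of the tree
    f_h = (n + 1).bit_length() - 1    # floor(lg(n + 1)): height of the full portion
    m_n = 2**h - 1                    # number of nodes if the tree were full
    f_n = 2**f_h - 1                  # number of nodes of the full portion
    return f_n // 2 + min(n - f_n, (m_n - f_n) // 2)


def to_bst_array(array):
    res = []
    q = deque([array]) if array else deque()
    while q:
        subarray = q.popleft()
        idx = find_idx(len(subarray))
        res.append(subarray[idx])
        if idx > 0:
            q.append(subarray[:idx])        # left part becomes the left subtree
        if len(subarray) > idx + 1:
            q.append(subarray[idx + 1:])    # right part becomes the right subtree
    return res


print(to_bst_array([1, 2, 3, 4, 5, 6]))  # [4, 2, 6, 1, 3, 5]
```

Because the queue visits subarrays level by level, the output order matches the array representation of the complete BST.
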
sga001
  • Actually I can add the code to directly build the array to my answer. It's just some minor alterations in my code. –  Apr 05 '16 at 19:40
0

This is my way of solving this task; I hope you like it!

def GenerateBBSTArray(a):
    a.sort()
    level = 0
    accum = []
    elements = []
    while len(a) // 2**level > 0:
        accum = [elem for elem in a[len(a) // 2**(level + 1)::(len(a) // 2**level) + 1]]
        elements.extend(accum)
        accum = []
        level += 1
    return elements
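
Tracing the slicing in `GenerateBBSTArray` by hand suggests that it matches the complete-BST layout when `len(a)` has the form 2^k - 1 (and for the six-element example in the question), but not for every length: for `[1, 2, 3, 4, 5]` it returns `[3, 2, 5, 1, 3, 5]`. A commonly used alternative, swapped in here and not taken from this answer, fills the level-order array with an in-order walk over heap indices. The function name `complete_bst_array` is an assumption.

```python
def complete_bst_array(values):
    """Level-order array of the complete BST holding the given values."""
    ordered = iter(sorted(values))
    n = len(values)
    result = [None] * n

    def fill(i):
        # In-order walk over heap indices: left child, node itself, right child.
        if i < n:
            fill(2 * i + 1)
            result[i] = next(ordered)
            fill(2 * i + 2)

    fill(0)
    return result


print(complete_bst_array([1, 2, 3, 4, 5]))  # [4, 2, 5, 1, 3]
```

The walk visits slots in sorted order, so handing out the sorted values one by one gives each slot its correct element for any length.
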
vladionair
-3

There is no direct mapping between a binary search tree (BST) and a sorted array. The only relationship is that an in-order traversal of a BST, stored in an array, yields the sorted array.

Ibukun Muyide
  • 1
    you're right. But that doesn't answer the question. For **complete** BSTs there exists a unique mapping between sorted array and BST. –  Apr 05 '16 at 09:06
  • There is no direct answer to this, as it can be looked at from different perspectives. In the real sense, there is no mapping between a sorted array and a BST stored in an array. The array representation of a BST is similar to a binary heap but not necessarily in sync; it is obvious that the size and some other details remain constant. – Ibukun Muyide Apr 05 '16 at 09:28
  • 2
    you're right. But you didn't even **understand** the question, so your answer is pretty useless. The question isn't about BSTs in general, but about **complete** BSTs, which is quite a difference. For BSTs the mapping isn't unique, but there only exists one **complete** BST for each sorted array. There is a direct answer to the question - the actual question. –  Apr 05 '16 at 09:39