4

I have implemented a link-based BST (binary search tree) in C++ for one of my assignments. I have written the whole class and everything works well, but the assignment asks me to plot the run-times for:

a.  A sorted list of 50000, 75000, and 100000 items
b.  A random list of 50000, 75000, and 100000 items

That's fine, I can insert the numbers, but it also asks me to call the FindHeight() and CountLeaves() methods on the tree. My problem is that I've implemented both functions using recursion. Since the list of numbers is so big, I'm getting a stack overflow exception.

Here's my class definition:

template <class TItem>
class BinarySearchTree
{
public:
    struct BinarySearchTreeNode
    {
    public:
        TItem Data;
        BinarySearchTreeNode* LeftChild;
        BinarySearchTreeNode* RightChild;
    };

    BinarySearchTreeNode* RootNode;

    BinarySearchTree();
    ~BinarySearchTree();

    void InsertItem(TItem);

    void PrintTree();
    void PrintTree(BinarySearchTreeNode*);

    void DeleteTree();
    void DeleteTree(BinarySearchTreeNode*&);

    int CountLeaves();
    int CountLeaves(BinarySearchTreeNode*);

    int FindHeight();
    int FindHeight(BinarySearchTreeNode*);

    int SingleParents();
    int SingleParents(BinarySearchTreeNode*);

    TItem FindMin();
    TItem FindMin(BinarySearchTreeNode*);

    TItem FindMax();
    TItem FindMax(BinarySearchTreeNode*);
};

FindHeight() implementation

template <class TItem>
int BinarySearchTree<TItem>::FindHeight()
{
    return FindHeight(RootNode);
}

template <class TItem>
int BinarySearchTree<TItem>::FindHeight(BinarySearchTreeNode* Node)
{
    if(Node == NULL)
        return 0;

    return 1 + max(FindHeight(Node->LeftChild), FindHeight(Node->RightChild));
}

CountLeaves() implementation

template <class TItem>
int BinarySearchTree<TItem>::CountLeaves()
{
    return CountLeaves(RootNode);
}

template <class TItem>
int BinarySearchTree<TItem>::CountLeaves(BinarySearchTreeNode* Node)
{
    if(Node == NULL)
        return 0;
    else if(Node->LeftChild == NULL && Node->RightChild == NULL)
        return 1;
    else
        return CountLeaves(Node->LeftChild) + CountLeaves(Node->RightChild);
}

I tried to think of how I can implement the two methods without recursion but I'm completely stumped. Anyone have any ideas?

Kev
  • 118,037
  • 53
  • 300
  • 385
Saad Imran.
  • 4,480
  • 2
  • 23
  • 33
  • 4
    (a) Question title should describe the question, not your emotional state/hopes&dreams; (b) we don't care when your assignment is due!; (c) no need to sign posts. Good luck! – Lightness Races in Orbit Nov 09 '11 at 23:53
  • @SaadImran: is it safe to say that the insert does no balancing whatsoever? – Mooing Duck Nov 10 '11 at 00:11
  • @SaadImran: on a whim, does it still stackoverflow if you put the word `static` before the function prototypes taking a `BinarySearchTreeNode*` in release builds? eg: `static void PrintTree(BinarySearchTreeNode*);` – Mooing Duck Nov 10 '11 at 00:20
  • @MooingDuck Yes, you're right, I didn't really think far enough to balance the tree, I just inserted numbers 1-100,000 into the tree. I will try balancing as well as adding the static keyword. Thanks! – Saad Imran. Nov 10 '11 at 01:01
  • Look into using red-black semantics for balancing. With a properly balanced tree, you only have to do `ceil(lg n)` levels of recursion. Even if the function is non-static, you shouldn't run out of stack space. – moshbear Nov 10 '11 at 01:06
  • @TomalakGeret'kal I didn't word the question to express my dreams, but to draw attention, I see the problem here, I won't do it next time. I know people don't care when my assignment is due, I just put it there thinking people might help faster. I don't see anything wrong with signing a post, although all I said was I appreciate any help in the problem. Anyways, not here to argue, I will try to follow those tips next time. Thanks. – Saad Imran. Nov 10 '11 at 01:11
  • @moshbear: I believe the upper-bound of red-black trees is `ceil(2*lg(n))` isn't it? – Mooing Duck Nov 10 '11 at 01:17
  • @MooingDuck insertion or search? In any case, it should alleviate the stack overflow. – moshbear Nov 10 '11 at 01:25
  • @moshbear: depth, and yes it solves the problem – Mooing Duck Nov 10 '11 at 05:08
  • Still better than `O(n)` by a long shot. – moshbear Nov 10 '11 at 05:55

5 Answers

3

Recursion on a tree with 100,000 nodes should not be a problem if it is balanced. The depth would only be maybe 17, which would not use very much stack in the implementations shown. (log2(100,000) = 16.61). So it seems that maybe the code that is building the tree is not balancing it correctly.
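As a rough check of that figure, the smallest possible height of a binary tree with n nodes is ceil(log2(n + 1)), counting levels the same way FindHeight() does (an empty tree has height 0). The helper below is only a sketch; the name MinHeight is made up for illustration and is not part of the class in the question.

```cpp
#include <cstddef>

// Smallest height (in levels, empty tree = 0) of a binary tree with n nodes.
// MinHeight is a hypothetical helper, not part of the original class.
int MinHeight(std::size_t n)
{
    int levels = 0;
    std::size_t capacity = 0;          // nodes that fit in `levels` full levels
    while (capacity < n)
    {
        capacity = 2 * capacity + 1;   // one more full level: 2^levels - 1 nodes
        ++levels;
    }
    return levels;
}
```

For 50000, 75000 and 100000 items this gives 16, 17 and 17 levels. A sorted insert with no balancing produces one level per item instead.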

Mark Wilkins
  • 40,729
  • 5
  • 57
  • 110
  • or more likely, insert is not balancing at all. – Mooing Duck Nov 10 '11 at 00:09
  • @Mooing: I agree that is quite possible. Which means that for a sorted list, the BST would basically result in one long linked list of 100,000 items. – Mark Wilkins Nov 10 '11 at 00:11
  • Umm, I didn't think this far. I was looping from 0 to 100,000 and inserting the loop control variable so all nodes in my list just branched off to the right. I will try balancing the tree. Thanks for the help. – Saad Imran. Nov 10 '11 at 00:59
2

I found this page very enlightening because it talks about the mechanics of converting a function that uses recursion to one that uses iteration.

It has examples showing code as well.

karlphillip
  • 92,053
  • 36
  • 243
  • 426
  • That only shows tail-recursion, while both of his are Binary-Recursive, which is very hard to convert to iteration without building a stack somewhere else. – Mooing Duck Nov 10 '11 at 00:23
  • Very hard? I think you are exaggerating. – karlphillip Nov 10 '11 at 00:29
  • It's significantly harder than building a stack. However, building a stack might be the easiest (if not most correct) route for him at this point. – Mooing Duck Nov 10 '11 at 00:37
  • There's one function where making an explicit stack in iteration is significantly more complex than utilizing the implicit stack in recursion: the Ackermann function. – moshbear Nov 10 '11 at 01:08
  • @moshbear: I'll concede that such an example might exist, I've simply never encountered or heard of one. Ackermann function would make sense though. – Mooing Duck Nov 10 '11 at 01:19
1

In order to count the leaves without recursion, use the concept of an iterator like the STL uses for the RB-tree underlying std::set and std::map ... Create begin() and end() functions for your tree that identify the ordered first and last nodes (in this case the left-most node and the right-most node). Then create a function called

BinarySearchTreeNode* increment(const BinarySearchTreeNode* current_node)

that for a given current_node, will return a pointer to the next node in the tree. Keep in mind for this implementation to work, you will need an extra parent pointer in your node type to aid in the iteration process.

Your algorithm for increment() would look something like the following:

  1. Check to see if there is a right-child to the current node.
  2. If there is a right-child, use a while-loop to find the left-most node of that right subtree. This will be the "next" node. Otherwise go to step #3.
  3. If there is no right-child on the current node, then check to see if the current node is the left-child of its parent node.
  4. If step #3 is true, then the "next" node is the parent node, so you can stop at this point; otherwise go to the next step.
  5. If step #3 was false, then the current node is the right-child of its parent. Thus you will need to keep moving up to the next parent node using a while loop until you come across a node that is a left-child of its parent node. The parent of this left-child node will then be the "next" node, and you can stop.
  6. Finally, if step #5 returns you to the root, then the current node is the last node in the tree, and the iterator has reached the end of the tree.

Finally you'll need a bool leaf(const BinarySearchTreeNode* current_node) function that will test whether a given node is a leaf node. Thus your counter function can simply iterate through the tree and find all the leaf nodes, returning a final count once it's done.
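The steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a drop-in for the class in the question: it uses a simplified non-template Node with an added Parent pointer, takes non-const pointers for brevity, and the names LeftMost and CountLeavesIterative are made up here.

```cpp
#include <cstddef>

// Simplified node with the extra Parent pointer this approach requires.
// (Parent is an addition; the node type in the question does not have it.)
struct Node
{
    int Data;
    Node* LeftChild;
    Node* RightChild;
    Node* Parent;
};

// Left-most node of a subtree: the first node in sorted order.
Node* LeftMost(Node* node)
{
    while (node != NULL && node->LeftChild != NULL)
        node = node->LeftChild;
    return node;
}

// In-order successor of current, or NULL once the last node has been passed.
Node* increment(Node* current)
{
    // Steps 1-2: a right child exists, so the successor is the left-most
    // node of the right subtree.
    if (current->RightChild != NULL)
        return LeftMost(current->RightChild);

    // Steps 3-6: climb while we are a right child. The first ancestor reached
    // from its left side is the successor (NULL if we climb past the root).
    Node* parent = current->Parent;
    while (parent != NULL && current == parent->RightChild)
    {
        current = parent;
        parent = parent->Parent;
    }
    return parent;
}

// True when the node has no children.
bool leaf(const Node* node)
{
    return node->LeftChild == NULL && node->RightChild == NULL;
}

// Counts leaves by walking the whole tree in order, with no recursion.
int CountLeavesIterative(Node* root)
{
    int leaves = 0;
    for (Node* n = LeftMost(root); n != NULL; n = increment(n))
        if (leaf(n))
            ++leaves;
    return leaves;
}
```

The walk uses constant extra memory, so its stack usage does not depend on how deep the tree is.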

If you want to measure the maximum depth of an unbalanced tree without recursion, you will, in your tree's insert() function, need to keep track of the depth that a node was inserted at. This can simply be a variable in your node type that is set when the node is inserted in the tree. You can then iterate through the tree, and find the maximum depth of a leaf-node.

BTW, the complexity of this method is unfortunately going to be O(N) ... nowhere near as nice as O(log N).

Jason
  • 31,834
  • 7
  • 59
  • 78
1

Maybe you need to calculate this while doing the insert. Store the heights of nodes, i.e. add an integer field like height to the Node object. Also keep height and leaves counters for the tree. When you insert a node, if its parent is (was) a leaf, the leaf count doesn't change, but if not, increase the leaf count by 1. Also, the height of the new node is its parent's height + 1, so if that is greater than the current height of the tree, update it. It's homework, so I won't help with the actual code.
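For readers who do want to see the bookkeeping, here is one possible sketch. It is an assumption-laden stand-in, not the class from the question: TrackedTree is a made-up, non-template type storing int, the insert is iterative, duplicates go to the right, and the depth is carried in a local variable rather than stored in each node.

```cpp
#include <cstddef>

// Simplified stand-in for the class in the question (hypothetical names).
// Height and LeafCount are kept up to date by InsertItem.
struct TrackedTree
{
    struct Node
    {
        int Data;
        Node* LeftChild;
        Node* RightChild;
    };

    Node* RootNode;
    int Height;      // number of levels, 0 for an empty tree
    int LeafCount;

    TrackedTree() : RootNode(NULL), Height(0), LeafCount(0) {}

    // Iterative delete: rotate right until there is no left child, then
    // delete the node and move right. Uses no recursion and no extra stack.
    ~TrackedTree()
    {
        Node* n = RootNode;
        while (n != NULL)
        {
            if (n->LeftChild != NULL)
            {
                Node* l = n->LeftChild;
                n->LeftChild = l->RightChild;
                l->RightChild = n;
                n = l;
            }
            else
            {
                Node* r = n->RightChild;
                delete n;
                n = r;
            }
        }
    }

    void InsertItem(int item)
    {
        Node* fresh = new Node();
        fresh->Data = item;

        if (RootNode == NULL)
        {
            RootNode = fresh;
            Height = 1;
            LeafCount = 1;
            return;
        }

        Node* parent = RootNode;
        int depth = 1;                       // level of `parent`, root = 1
        for (;;)
        {
            Node*& next = (item < parent->Data) ? parent->LeftChild
                                                : parent->RightChild;
            if (next == NULL)
            {
                bool parentWasLeaf = parent->LeftChild == NULL &&
                                     parent->RightChild == NULL;
                next = fresh;
                if (!parentWasLeaf)
                    ++LeafCount;             // a new leaf appeared
                if (depth + 1 > Height)
                    Height = depth + 1;      // the new node is the deepest so far
                return;
            }
            parent = next;
            ++depth;
        }
    }
};
```

With this in place, FindHeight() and CountLeaves() could simply return the stored counters, at the cost of having to maintain them in any delete operation as well.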

Adithya Surampudi
  • 4,354
  • 1
  • 17
  • 17
  • Thanks, I did consider this, but I assumed since he wanted a function he actually wanted us to traverse the tree and calculate it at the end. I will probably end up doing this, if I cannot figure out how to balance it properly in time. Thanks! – Saad Imran. Nov 10 '11 at 01:06
1

Balance your tree occasionally. If FindHeight() overflows the stack, that means your tree is way unbalanced. If the tree is balanced it should only have a depth of about 20 nodes for 100000 elements.

The easiest (but fairly slow) way of re-balancing an unbalanced binary tree is to allocate an array of TItem big enough to hold all of the data in the tree, insert all of your data into it in sorted order, and delete all of the nodes. Then rebuild the tree from the array recursively. The root is the node in the middle. root->left is the middle of the left half, root->right is the middle of the right half. Repeat recursively. This is the easiest way to rebalance, but it is slowish and takes lots of memory temporarily. On the other hand, you only have to do this when you detect that the tree is very unbalanced (say, when the depth on insert is more than 100).
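The rebuild step might look something like this. It is only a sketch under assumptions: a simplified non-template Node holding int, a std::vector in place of the raw array, and made-up helper names (BuildBalanced, HeightOf, Destroy). Copying the data out of the unbalanced tree and deleting the old nodes is not shown, and would itself have to be done without deep recursion.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Node
{
    int Data;
    Node* LeftChild;
    Node* RightChild;
};

// Builds a balanced subtree from the half-open range sorted[lo, hi).
// The middle element becomes the root, so recursion depth is about log2(n).
Node* BuildBalanced(const std::vector<int>& sorted, std::size_t lo, std::size_t hi)
{
    if (lo >= hi)
        return NULL;

    std::size_t mid = lo + (hi - lo) / 2;
    Node* root = new Node();
    root->Data = sorted[mid];
    root->LeftChild = BuildBalanced(sorted, lo, mid);
    root->RightChild = BuildBalanced(sorted, mid + 1, hi);
    return root;
}

// Recursive height, safe here only because the tree is balanced.
int HeightOf(const Node* node)
{
    if (node == NULL)
        return 0;
    return 1 + std::max(HeightOf(node->LeftChild), HeightOf(node->RightChild));
}

// Recursive cleanup, again relying on the tree being balanced.
void Destroy(Node* node)
{
    if (node == NULL)
        return;
    Destroy(node->LeftChild);
    Destroy(node->RightChild);
    delete node;
}
```

Because each call splits its range in half, the recursive height function from the question works fine on the rebuilt tree.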

The other (better) option is to balance during inserts. The most intuitive way to do this is to keep track of how many nodes are beneath the current node. If the right child has more than twice as many "child" nodes as the left child, "rotate" left. And vice-versa. There are instructions on how to do tree rotations all over the internet. This makes inserts slightly slower, but then you don't have the occasional massive stalls that the first option creates. On the other hand, you have to constantly update all of the "children" counts as you do the rotations, which isn't trivial.
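A single left rotation with the count update could be sketched like this. The node layout is an assumption: Size is a made-up field that counts the nodes in a subtree including the node itself, which differs slightly from "nodes beneath" but makes the update simpler. Deciding when to rotate, and the mirror-image right rotation, are left out.

```cpp
#include <cstddef>

struct Node
{
    int Data;
    int Size;            // nodes in this subtree, including this node (assumed field)
    Node* LeftChild;
    Node* RightChild;
};

// Size of a possibly empty subtree.
int SizeOf(const Node* node)
{
    return node != NULL ? node->Size : 0;
}

// Rotates the subtree rooted at root to the left and returns the new root.
// Requires root->RightChild != NULL. In-order sequence is preserved.
Node* RotateLeft(Node* root)
{
    Node* pivot = root->RightChild;

    root->RightChild = pivot->LeftChild;   // pivot's left subtree moves under root
    pivot->LeftChild = root;               // root becomes pivot's left child

    // Recompute counts bottom-up: root is now below pivot, so fix it first.
    root->Size = 1 + SizeOf(root->LeftChild) + SizeOf(root->RightChild);
    pivot->Size = 1 + SizeOf(pivot->LeftChild) + SizeOf(pivot->RightChild);
    return pivot;
}
```

The caller has to store the returned pointer back into whatever pointed at the old root (the parent's child pointer or RootNode).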

Mooing Duck
  • 64,318
  • 19
  • 100
  • 158