
I am trying to implement a function in C that finds the smallest int greater than or equal to a given int in an AVL tree. For example:

  • if I have an AVL tree consisting of 1,2,3,4,5,6,7 and I put in 6, it should return 6.

  • if I have an AVL tree consisting of 1,2,3,4,6,7 and I put in 5, it should return 6.

  • if none are found, return -1.

I have found a case (there could be more) where this implementation fails: with an AVL tree of 1,2,3,4,5,6,7 and an input of 3, it incorrectly returns 4. This happens when the root is bigger than the input. I am not sure how to fix this. If there are other failing cases, it would be great if you could point them out.

Here is my attempt:

int findLeastGreatest(Node *root, int input) {
    // Never found
    if (root->left == NULL && root->right == NULL && root->data < input)
        return -1;
    // Found
    if ((root->data >= input && root->left == NULL) ||
        (root->data >= input && root->left->data < input)) 
        return root->data;

    if (root->data <= input)     
        return findLeastGreatest(root->right, input);
    else        
        return findLeastGreatest(root->left, input);
}
Jonathan Leffler
  • I think the `|| (root->data >= input && root->left->data < input)` term is at least suspicious. It isn't clear whether it's an attempt at optimization or a key part of the algorithm. I wonder if the `if (root->data <= input)` condition should use `<` instead of `<=`? If it's equal, you should not need to search further, which probably has ramifications. If the search of the left tree finds nothing (returns `-1`) but the current node is bigger than the search item, you should return the current node, should you not? – Jonathan Leffler Oct 21 '22 at 19:35
  • Once you've found a node whose `data` is greater than the input, and the left child's `data` is less than the input, you still need to search the left child's right subtree, since it may hold a value between the input and `root->data`. So you can't just return `root->data` when `root->data >= input && root->left->data < input` – user3386109 Oct 21 '22 at 19:36
  • I think the algorithm works on any BST — binary search tree — and does not in any way rely on the tree being an AVL tree (which is a balanced form of a BST, and hence is a BST). – Jonathan Leffler Oct 21 '22 at 19:40

2 Answers

Your function has problems: it tests too many conditions together, and it dereferences `root` (and recurses into child pointers) without ever checking for NULL.

Here is a simpler approach:

  • if root is NULL, return -1;
  • if root->data < input, recurse on the root->right node;
  • if root->data == input, return input;
  • otherwise, recurse on the root->left node and return the result if one was found, else return root->data.

Here is an implementation:

int findLeastGreatest(const Node *root, int input) {
    if (!root)
        return -1;
    if (root->data < input)
        return findLeastGreatest(root->right, input);
    if (root->data == input)
        return input;
    int value = findLeastGreatest(root->left, input);
    if (value == -1)
        return root->data;
    else
        return value;
}

If you are not required to produce a recursive version, here is a simpler version with a while loop:

int findLeastGreatest(const Node *root, int input) {
    int value = -1;
    while (root) {
        if (root->data < input) {
            root = root->right;
        } else {
            value = root->data;
            root = root->left;
        }
    }
    return value;
}
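
For what it's worth, here is a minimal sketch for checking the loop version against the cases from the question. The `Node` layout and the helpers `buildBalanced`, `freeTree` and `checkExamples` are assumptions made for illustration; the question does not show the real tree code, and the helper builds a plain balanced BST from a sorted array rather than using AVL insertion.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed node layout; the question does not show the real definition. */
typedef struct Node {
    int data;
    struct Node *left;
    struct Node *right;
} Node;

/* Loop version from the answer: smallest value >= input, or -1 if none. */
int findLeastGreatest(const Node *root, int input) {
    int value = -1;
    while (root) {
        if (root->data < input) {
            root = root->right;      /* too small: only the right side can qualify */
        } else {
            value = root->data;      /* candidate: remember it, look for a smaller one */
            root = root->left;
        }
    }
    return value;
}

/* Hypothetical helper: builds a balanced BST from sorted[lo..hi]. */
Node *buildBalanced(const int *sorted, int lo, int hi) {
    if (lo > hi)
        return NULL;
    int mid = lo + (hi - lo) / 2;
    Node *n = malloc(sizeof *n);
    if (!n) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    n->data = sorted[mid];
    n->left = buildBalanced(sorted, lo, mid - 1);
    n->right = buildBalanced(sorted, mid + 1, hi);
    return n;
}

/* Hypothetical helper: releases every node of the tree. */
void freeTree(Node *n) {
    if (!n)
        return;
    freeTree(n->left);
    freeTree(n->right);
    free(n);
}

/* Hypothetical driver: asserts the behaviour described in the question. */
void checkExamples(void) {
    int full[] = {1, 2, 3, 4, 5, 6, 7};
    int gap[] = {1, 2, 3, 4, 6, 7};
    Node *a = buildBalanced(full, 0, 6);
    Node *b = buildBalanced(gap, 0, 5);

    assert(findLeastGreatest(a, 6) == 6);   /* exact match */
    assert(findLeastGreatest(a, 3) == 3);   /* the case that failed in the question */
    assert(findLeastGreatest(b, 5) == 6);   /* 5 is missing, next larger is 6 */
    assert(findLeastGreatest(a, 8) == -1);  /* nothing large enough */

    freeTree(a);
    freeTree(b);
}
```

Calling `checkExamples()` from `main` should finish silently if the function behaves as described; a failed `assert` aborts the program.
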
chqrlie
  • Yes, the complexity is **O(log n)** because the function recurses at most once per call, so it follows a single path down the tree, whose length is bounded by the tree height. For an AVL tree the height is at most about `1.44 log2 n`; if the tree is not balanced, the complexity can reach **O(n)**. – chqrlie Oct 21 '22 at 20:06
  • @chemslatt123: can you post the actual program code? – chqrlie Oct 21 '22 at 20:26
  • @chemslatt123: yes, but there might be other problems in the rest of the testing code :) – chqrlie Oct 21 '22 at 20:31
  • @chemslatt123: if you get a runtime error (a segmentation fault I presume) on this: `if (root->data < input)`, it means `root` is an invalid pointer. Since `root` cannot be a null pointer after the previous test, it seems your AVL tree construction code produced an invalid pointer somewhere in the tree. Try and print the offending tree before calling `findLeastGreatest()` – chqrlie Oct 21 '22 at 20:38

I find it easier to write this function in a loop. The algorithm in the pseudocode below should work. The key idea is to not assign to bound unless the condition (node.key >= key) is true, in which case you must also traverse left to look for potentially smaller keys that satisfy the same condition. Otherwise, traverse right to look for larger keys that might satisfy.

least_greatest(node, key):
  bound = -1
  while node != NULL:
    if node.key >= key:
      bound = node.key  # found a bound, but it might not be the least bound
      node = node.left  # look for a smaller key
    else:
      node = node.right  # look for larger keys
  return bound
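
The pseudocode above could be written in C roughly as follows. This is only a sketch: the `Node` layout (with a `key` field, as in the pseudocode) and the name `leastGreatestNode` are assumptions, and returning a node pointer instead of `-1` is a design variation of mine, chosen so that "not found" cannot be confused with a tree that actually stores -1.

```c
#include <stddef.h>

/* Assumed node layout, using the field names from the pseudocode. */
typedef struct Node {
    int key;
    struct Node *left;
    struct Node *right;
} Node;

/* Returns the node holding the smallest key >= key, or NULL if there is none. */
const Node *leastGreatestNode(const Node *node, int key) {
    const Node *bound = NULL;
    while (node != NULL) {
        if (node->key >= key) {
            bound = node;        /* found a bound, but maybe not the least one */
            node = node->left;   /* look for a smaller key that still qualifies */
        } else {
            node = node->right;  /* look for larger keys */
        }
    }
    return bound;
}
```

A caller would test the result against `NULL` before reading `->key`.
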

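P.S. this operation corresponds to lower_bound in the C++ STL (the first element not less than the key); upper_bound is the one that returns the first element strictly greater than the key. I've also seen this called "least upper bound".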

MattArmstrong