12

So I have been looking into implementing a lowest common ancestor algorithm. I looked at many different algorithms (mainly variations of Trajan's solution or variations of the RMQ).

I am using a non-binary tree. My tree will often change between queries and therefore pre-processing wouldn't necessarily be worthwhile. The tree shouldn't have more than 50-75 nodes. What I am wondering is whether I should bother using their algorithms or just stick with my own.

My Algorithm

myLCA(node1, node2) {
    parentNode := [ ]
    while (node1!=NULL) {
         parentNode.push(node1)
         node1 := node1.parent
    }
     while (node2!=NULL) {
         for i in parentNode.size {
             if (parentNode(i) == node2) {
                 return node2; 
             }
         }
         node2 := node2.parent
     }

}       
slimbo
  • 2,699
  • 4
  • 25
  • 36

6 Answers6

17

As others have mentioned, your algorithm is currently quadratic. That may not be a problem for a dataset as small as 50-75 nodes, but in any case it's straightforward to change it to linear time without using any sets or hashtables, just by recording the complete path to the root for each node, then walking back down from the root and looking for the first different node. The immediately preceding node (which is the common parent of these 2 different nodes) is then the LCA:

linearLCA(node1, node2) {
    parentNode1 := [ ]
    while (node1!=NULL) {
         parentNode1.push(node1)
         node1 := node1.parent
    }
    parentNode2 := [ ]
    while (node2!=NULL) {
         parentNode2.push(node2)
         node2 := node2.parent
    }
    while (node1 == node2 && !isEmpty(parentNode1) && !isEmpty(parentNode2)) {
        oldNode := node1
        node1 := parentNode1.pop()
        node2 := parentNode2.pop()
    }
    if (node1 == node2) return node1    // One node is descended from the other
    else return oldNode                 // Neither is descended from the other
}

EDIT 27/5/2012: Handle the case in which one node is descended from the other, which would otherwise result in attempting to pop() an empty stack. Thanks to damned for pointing this out. (I also realised that it's sufficient to track a single oldNode.)

j_random_hacker
  • 50,331
  • 10
  • 105
  • 169
  • Modification : In languages like java, pop() can throw EmptyStackException, which should be handled either by Exception Handling or by checking the sizes of both stacks before popping. – damned May 24 '12 at 08:51
  • @damned: Actually that's not necessary, because the fact that `node1` and `node2` belong to the same tree guarantees that a common ancestor will be found in the final `while` loop -- neither of those calls to `pop()` can possibly fail. – j_random_hacker May 25 '12 at 06:57
  • 2
    The exception will always occur if we have one node to be the descendant of the other. Take a parent-child pair for example. – damned May 25 '12 at 08:56
  • @damned: You're quite right, my apologies. I'll update in a moment to fix that. – j_random_hacker May 26 '12 at 13:59
  • still its not handling properly for one node is descended from other i think. Say list1 has 3 elements and list2 has 2 elements. After running while twice it has problem. – NRK Mar 13 '13 at 09:55
  • @NRK: Assuming you mean the case when `node1` is a child of `node2`: That case seems to work for me when I write down the states after each step. The `while` condition is tested 3 times: the first time, `node1` and `node2` are both `null`, the second time they are both equal to the root node, the third time they are both equal to the 2nd node in the path from the root (this will be `node2` itself) but `parentNode2` is empty so the loop exits and `if (node1 == node2) return node1` succeeds. Note that `node1` gets modified inside the function -- it won't return the original value passed in. – j_random_hacker Mar 13 '13 at 17:20
4

For a tree that small, I wouldn't bother implementing anything more complex. Your solution looks good, although the time complexity is squared in terms of the height of the tree. If you can easily implement a Set (most languages have it built-in), then the algorithm could be tweaked to,

  1. Traverse from the first node up to the root and collect all nodes in a set
  2. Traverse from the second node up to the root and check if the current node exists in that set. If it does, then that is the common ancestor.

Also, this algorithm assumes that a node can be its own ancestor. Otherwise, you would have to tweak the algorithm slightly. Consider this example,

A
|
B
|
C

When trying to find the lowest common ancestor of B, and C, this algorithm would report B, which may or may not be true depending on how you define ancestor.

Anurag
  • 140,337
  • 36
  • 221
  • 257
3

Without looking at the details of either algorithm, I'd suggest looking at how important the efficiency of this algorithms is to your overall application, and how much effort would be required to implement another algorithm.

How many times will this algorithm be run in normal (or stressed) operation of your application? Will it cause the user to wait longer than necessary? Are the other algorithms of a different order of magnitude faster than yours? (Someone familiar with the algorithms can give you a more detailed answer on this.)

I think it is not worth optimising a bit of code unless you will see sizable results (some people feel quite strongly that premature optamisation is the root of all evil)

BudgieInWA
  • 2,184
  • 1
  • 17
  • 31
2

Your algorithm is quadratic, but it can easily be made linear.

Just use hashtable (i.e. set) for parentNode, instead of list. Thus checking whether a node is in parentNode will be O(1) instead of O(n).

Petar Ivanov
  • 91,536
  • 11
  • 82
  • 95
2

I just wrote a blog post about how I had to implement my own algorithm for this problem but extended to a set of nodes with an arbitrary length. You can find it here (with a step-by-step graphical explanation of how it works)

http://bio4j.com/blog/2012/02/finding-the-lowest-common-ancestor-of-a-set-of-ncbi-taxonomy-nodes-with-bio4j/

Cheers,

Pablo

ppareja
  • 730
  • 1
  • 5
  • 16
1

I have one simplistic solution sort the two elements and the lowest be left and highest be right visit root def recurse(root) return nil if root.empty? if left <= root && right >= root return root elsif left <= root && right <= root recurse(root.left) else recurse(root.right) end

So this will check on each traversal The problem soluted in O(log n) time for average and worst and O(log