4

The problem

Given a rooted tree with n nodes in which every leaf carries a label from some set of labels, build a data structure that, given a leaf node a and a label l, finds the lowest ancestor u of a such that u has at least one descendant with label l.

Linear Space / Linear Time approach

  • Start at leaf a
  • Examine all siblings of a
    • If a sibling has label l find the lowest-common-ancestor between a and that sibling.
    • Otherwise continue to parents
  • If none of the leaf nodes below the parent is labelled l, continue to the grandparent and check its leaf descendants.
  • Continue recursively, checking each higher ancestor and all of its descendant leaf nodes, until an l-labelled descendant is found.

This has time complexity O(n) and space complexity O(n).
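The steps above can be sketched roughly as follows. This is only an illustration: the dictionaries `PARENT`, `CHILDREN` and `LABEL`, the sample tree and the function name `naive_query` are made up for this example, and the sketch assumes that the leaf a itself does not count as a match even if it carries label l. To stay within O(n), each ancestor only scans the child subtrees that were not already scanned on the way up.

```python
# Hypothetical sample tree (not from the question):
#         0
#       /   \
#      1     2
#     / \     \
#    3   4     7
#       / \
#      5   6
PARENT = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 4, 6: 4, 7: 2}
CHILDREN = {0: [1, 2], 1: [3, 4], 4: [5, 6], 2: [7]}
LABEL = {3: "x", 5: "y", 6: "z", 7: "y"}  # only leaves are labelled


def naive_query(parent, children, label, a, l):
    """Walk up from leaf a and return the lowest ancestor that has an
    l-labelled leaf (other than a) below it, or None if there is none."""

    def has_label(root):
        # Iterative DFS over the subtree rooted at `root`.
        stack = [root]
        while stack:
            v = stack.pop()
            if label.get(v) == l:
                return True
            stack.extend(children.get(v, ()))
        return False

    child, u = a, parent.get(a)
    while u is not None:
        # Skip the subtree we just came from: it was already searched.
        if any(has_label(c) for c in children.get(u, ()) if c != child):
            return u
        child, u = u, parent.get(u)
    return None
```

On the sample tree, `naive_query(PARENT, CHILDREN, LABEL, 5, "y")` has to climb all the way to the root, because the only other "y" leaf is node 7.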


Is there a faster way to do this with linear space complexity? Perhaps by preprocessing the tree somehow? l and a are not fixed, so the preprocessing has to be flexible.

After preprocessing, the lowest common ancestor of two nodes can be found in constant time using RMQ over an Euler tour of the tree.

Keep in mind the tree is not balanced or sorted in any way.

DannyDannyDanny
  • If `a` and `l` are siblings, then by definition they share the same parent, which is the lowest common ancestor. What do you mean here by "sibling" of `a`? Also, do your nodes have parent pointers? – Jim Mischel Jul 26 '18 at 13:27
  • @JimMischel Yes, if **a** and some node with label **l** are siblings, then the parent would be the lowest common ancestor. I just used "lowest common ancestor" so that it would apply to any two leaf nodes. – DannyDannyDanny Jul 26 '18 at 13:46
  • If you're willing to spend the time and memory in preprocessing, you can create a lookup table that gives you O(1) query. The memory requirement would be O(m^2), where m is the number of leaf nodes. – Jim Mischel Jul 26 '18 at 15:18
  • Ah yes, however the algorithm is intended for arbitrary trees with no additional data structures. So the pre-processing would factor the time complexity. Thanks though! – DannyDannyDanny Jul 27 '18 at 11:53
  • If you want to do a single query against a tree, then your worst case time complexity is O(n), because you potentially have to look at every node. And worst case space complexity is O(n) because in a degenerate tree you potentially have to maintain a path that contains every node in the tree. I don't think you can improve the complexity, given the constraints you've outlined. – Jim Mischel Jul 27 '18 at 13:49

2 Answers

1

So, now I found a better solution:

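The idea is the following: the further apart two nodes appear on the Euler tour, the higher their LCA is, i.e. index(a) < index(b) < index(c) => dist_to_root(LCA(a, b)) >= dist_to_root(LCA(a, c)). This holds because the LCA of two nodes is the shallowest node visited between them on the tour, and a longer stretch of the tour can only reach an equally shallow or shallower node.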

This means that you only have to compute the LCA of a and the first node after a with the label l in the path, and the LCA of a and the last node before a with the label l in the path.

One of them will give the optimal solution to the problem.

To find these two indices efficiently, create a sorted list of tour indices for each label during preprocessing, and binary-search it in O(log n) per query.

Memory complexity is O(n).
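A minimal sketch of this idea, under some assumptions: the names `build_index` and `query_lowest_ancestor` and the sample tree are invented for illustration, the leaf a itself is skipped when looking for neighbours with label l, and the LCA is found by a plain linear scan for the minimum depth on the tour. That scan is only a stand-in for the constant-time RMQ structure the question mentions, so this sketch does not reach O(log n) per query by itself.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical sample tree: 0 -> (1, 2), 1 -> (3, 4), 4 -> (5, 6), 2 -> (7)
CHILDREN = {0: [1, 2], 1: [3, 4], 4: [5, 6], 2: [7]}
LABEL = {3: "x", 5: "y", 6: "z", 7: "y"}  # only leaves are labelled


def build_index(root, children, label):
    """Preprocessing: Euler tour, depths, first occurrences, and for each
    label the sorted list of tour indices of the leaves carrying it."""
    euler, depth, first = [], [], {}
    by_label = defaultdict(list)

    def dfs(u, d):  # recursive for brevity; very deep trees need an explicit stack
        first[u] = len(euler)
        if u in label:
            by_label[label[u]].append(first[u])  # appended in increasing order
        euler.append(u)
        depth.append(d)
        for c in children.get(u, ()):
            dfs(c, d + 1)
            euler.append(u)  # revisit u after each child
            depth.append(d)

    dfs(root, 0)
    return euler, depth, first, by_label


def query_lowest_ancestor(index, a, l):
    euler, depth, first, by_label = index
    pos = by_label.get(l, [])
    i = first[a]
    k = bisect_left(pos, i)
    candidates = []
    if k > 0:
        candidates.append(pos[k - 1])  # last l-leaf before a on the tour
    nxt = k + 1 if k < len(pos) and pos[k] == i else k  # skip a itself
    if nxt < len(pos):
        candidates.append(pos[nxt])  # first l-leaf after a on the tour
    best = None
    for j in candidates:
        lo, hi = min(i, j), max(i, j)
        # Stand-in for an O(1) RMQ: position of the minimum depth in [lo, hi].
        m = min(range(lo, hi + 1), key=depth.__getitem__)
        if best is None or depth[m] > depth[best]:
            best = m  # keep the deeper (lower) of the two LCAs
    return None if best is None else euler[best]


INDEX = build_index(0, CHILDREN, LABEL)
```

Replacing the linear scan with a sparse table or another RMQ structure over `depth` would leave the binary search as the dominant cost of a query.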

Jakube
  • But doesn't creating a list of indices take O(n) time? – Jim Mischel Jul 26 '18 at 15:12
  • @JimMischel Yes, but you only need to do it once during preprocessing. – Jakube Jul 26 '18 at 15:14
  • Note that LCA = "lowest common ancestor", and "Euler Path" = "Euler Tour representation of the tree" (https://en.wikipedia.org/wiki/Euler_tour_technique), which also enables transforming the LCA problem into a "range minimum query", which you can do in O(log N) as well. See https://www.geeksforgeeks.org/find-lca-in-binary-tree-using-rmq/ and https://en.wikipedia.org/wiki/Range_minimum_query – Matt Timmermans Jul 27 '18 at 05:33
  • @MattTimmermans Yes, I know. It can even be done in O(1). The OP already wrote that he knows how to do this, so I didn't explain it in my answer. – Jakube Jul 27 '18 at 10:07
  • @user38034 Thank you! I'll give this a shot. Yes, by calculating the Eulerian Path and BST in pre-processing, the query can be done in **O(log n)** time complexity. Considering your reply, this could perhaps be pushed down to **O(log log n)** using [y-fast-tries](https://en.wikipedia.org/wiki/Y-fast_trie). – DannyDannyDanny Jul 27 '18 at 12:10
  • @DannyDannyDanny Just a reminder: here on Stackoverflow there is this concept of upvotes and accepting answers, which usually means a lot more than a thank you :-P – Jakube Jul 27 '18 at 12:23
0

Here is a solution with O(log(n)^3) time complexity and O(n log(n)) space complexity.

Let L be the list of labels encountered along the Euler tour. Build a segment tree over this list and store in each node of the segment tree the set of labels appearing in its segment. A range query on the segment tree then tells you in O(log(n)^2) time whether a label appears in a given subtree.

To find the correct ancestor, binary-search over the ancestors of a, for example with binary lifting, which adds another factor of log(n).
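One way this could look is sketched below, with several assumptions: nodes are taken to be the integers 0..n-1, a preorder numbering stands in for the Euler tour (each subtree is then one contiguous range), the label sets are Python hash sets (so a range check costs O(log n) set lookups rather than the O(log(n)^2) of an ordered set), and the leaf a itself is excluded from the search. The names `build_segment_index`, `range_has` and `lowest_ancestor_by_lifting` and the sample tree are hypothetical.

```python
# Hypothetical sample tree: 0 -> (1, 2), 1 -> (3, 4), 4 -> (5, 6), 2 -> (7)
CHILDREN = {0: [1, 2], 1: [3, 4], 4: [5, 6], 2: [7]}
LABEL = {3: "x", 5: "y", 6: "z", 7: "y"}  # only leaves are labelled


def build_segment_index(root, children, label, n):
    tin, tout, parent, order = [0] * n, [0] * n, [-1] * n, []

    def dfs(u):  # recursive for brevity
        tin[u] = len(order)
        order.append(u)
        for c in children.get(u, ()):
            parent[c] = u
            dfs(c)
        tout[u] = len(order)  # subtree of u is preorder range [tin[u], tout[u])

    dfs(root)
    # Segment tree over preorder positions; each node holds the label set of its range.
    size = 1
    while size < n:
        size *= 2
    tree = [set() for _ in range(2 * size)]
    for p, u in enumerate(order):
        if u in label:
            tree[size + p].add(label[u])
    for i in range(size - 1, 0, -1):
        tree[i] = tree[2 * i] | tree[2 * i + 1]
    # Binary lifting table: up[k][v] is the 2^k-th ancestor of v, or -1.
    up = [parent]
    for _ in range(1, max(1, n.bit_length())):
        prev = up[-1]
        up.append([-1 if prev[v] == -1 else prev[prev[v]] for v in range(n)])
    return tin, tout, size, tree, up


def range_has(tree, size, lo, hi, l):
    """True if label l occurs at some preorder position in [lo, hi)."""
    lo += size
    hi += size
    while lo < hi:
        if lo & 1:
            if l in tree[lo]:
                return True
            lo += 1
        if hi & 1:
            hi -= 1
            if l in tree[hi]:
                return True
        lo //= 2
        hi //= 2
    return False


def lowest_ancestor_by_lifting(index, a, l):
    tin, tout, size, tree, up = index

    def has(u):  # does the subtree of u contain an l-leaf other than a?
        return (range_has(tree, size, tin[u], tin[a], l)
                or range_has(tree, size, tin[a] + 1, tout[u], l))

    v = a
    for k in reversed(range(len(up))):
        w = up[k][v]
        if w != -1 and not has(w):
            v = w  # still no l-leaf below w, so keep climbing
    # v is now the highest ancestor-or-self without a match; its parent is the answer.
    return None if up[0][v] == -1 else up[0][v]


SEG_INDEX = build_segment_index(0, CHILDREN, LABEL, 8)
```

The lifting loop relies on the check being monotone: once some ancestor has an l-labelled leaf below it, every higher ancestor does too.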

Jakube