
So, I have the obvious brute-force algorithm, which goes as follows:

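#include <stddef.h>

/* Node layout assumed from the field names used below. */
typedef struct binTree {
    int val;
    struct binTree *left, *right;
} binTree;

int isEqual (binTree *S, binTree *T);

/* Returns 1 if some subtree of S is identical to T, 0 otherwise. */
int isSubtree (binTree *S, binTree *T)
{
    if (S == NULL)
        return 0;
    return (isEqual (S,T) || isSubtree (S->left, T) || isSubtree (S->right, T));
}

/* Returns 1 if S and T have the same shape and the same values. */
int isEqual (binTree *S, binTree *T)
{
    if (S==NULL && T==NULL)
        return 1;
    if (S==NULL || T==NULL)
        return 0;
    if (S->val == T->val)
        return isEqual(S->left,T->left) && isEqual (S->right,T->right);
    else
        return 0;
}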

But this is an O(n²) approach.

I have another approach, which is O(n). We traverse the first tree in inorder fashion and store the traversal in an array, then do the same for the second tree. If the second array is a subarray of the first, we repeat the same procedure with preorder traversal. If both checks succeed, the second tree is a subtree of the first; otherwise it is not.

Can somebody tell me whether this algorithm would work or not?

And is there a more space optimized solution to this problem?

Note: I need two arrays, since I am storing the traversals of both trees. Is there any way I could make do with just one array? For example, I could store the inorder traversal of one tree and then check the subarray condition against that array while traversing the other tree. Or maybe no extra space at all, but still O(n) time complexity?

Note: By subarray, I mean that the elements must occur consecutively, i.e.,

{2,3,5} is a subarray of {1,2,3,5} but not a subarray of {1,2,3,4,5}
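
To pin down the "consecutive elements" condition, here is a minimal sketch of the check. The name `isSubarray` and its signature are my own, not from the question, and the scan is deliberately naive, O(n·m), so it does not by itself give the O(n) bound.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if needle[0..m) occurs as a contiguous
   run inside hay[0..n), 0 otherwise. An empty needle matches trivially. */
int isSubarray(const int *hay, size_t n, const int *needle, size_t m)
{
    if (m > n)
        return 0;
    for (size_t i = 0; i + m <= n; i++) {
        size_t j = 0;
        /* Compare the window starting at i against the whole needle. */
        while (j < m && hay[i + j] == needle[j])
            j++;
        if (j == m)
            return 1;
    }
    return 0;
}
```

With this definition, {2,3,5} is found in {1,2,3,5} but not in {1,2,3,4,5}.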
user2560730
  • Your brute force algo is of complexity O(n) only. How do you say it is n^2 soln? – Karthikeyan Jul 22 '13 at 04:33
  • In every recursion, you are effectively removing a node. So, it is just O(n) and this needs to be done 3 times. – Karthikeyan Jul 22 '13 at 04:35
  • @Karthikeyan: imagine that all nodes in `S` hold 1, while all nodes in `T` are 1 except that one rightmost child holds 2. Then every `isEqual` check would traverse all nodes of `T`, only to discover a mismatch at the very end. This is repeated for every node in `S`. – Igor Tandetnik Jul 22 '13 at 04:37
  • Got it, I am sorry, my bad.. – Karthikeyan Jul 22 '13 at 04:38
  • isEqual() function is O(n) and in worst case we are traversing each node in isSubtree(), we would get an overall time complexity of O(n²), wouldn't we? – user2560730 Jul 22 '13 at 04:39
  • Basically, your naive algorithm is equivalent to a naive substring search, which is O(n^2). To improve on that, you would probably need something like Boyer-Moore. I suspect it might be possible to adapt Boyer-Moore or another string search to run directly against the tree structure, without explicitly materializing the traversal. – Igor Tandetnik Jul 22 '13 at 04:43
  • Any comments on the second algorithm I mentioned? Does that seem correct to you? – user2560730 Jul 22 '13 at 04:45
  • Wouldn't pre order traversal alone make sure the tree equivalence? – Karthikeyan Jul 22 '13 at 04:48
  • It looks OK, but note that, again, you would need Boyer-Moore or similar in order to check faster than O(n^2) time whether one sequence is a sub-sequence of the other. – Igor Tandetnik Jul 22 '13 at 04:49
  • O(n^2) is an overestimate. O(n*k) is a better estimate, where k is the number of nodes with equivalent values to the root of the second tree. Even then, the algorithm will perform linear in practice, as k << n. – pippin1289 Jul 22 '13 at 04:49
  • Okay, the KMP algorithm can give me O(n) time. However, if the trees contain unique elements, a simple brute-force algo will result in O(n) – user2560730 Jul 22 '13 at 04:51
  • @karthikeyan: It won't. It will result in false positives. – user2560730 Jul 22 '13 at 04:51
  • @Karthikeyan: a tree with 1 in the root and 2, 3 as children has the same pre-order traversal as a tree with 1 in the root, 2 as its only child, and 3 as the child of that. In general, multiple trees can happily have the same in-order traversal. – Igor Tandetnik Jul 22 '13 at 04:52
  • If you know that the trees contain unique elements, then your original brute-force algorithm would work just fine. In fact, you can just find a node in S that has the same value as the root in T, and run `isEqual` on these two nodes. – Igor Tandetnik Jul 22 '13 at 04:55
  • That is what I am doing anyhow. The only difference being, I am doing that check of equal values in the function isEqual(), rather than checking it in isSubtree(). The moment it sees two unequal nodes in isEqual(), it returns 0. So the only difference would be that of a function call. But I am still not entirely convinced that the algorithm will be O(n) and not O(n²). – user2560730 Jul 22 '13 at 04:58
  • Wait a minute. If `T` consists of a single node with value 1, and `S` happens to have value 1 at the root (but has other nodes), is such T considered a subtree of such S? Your code says "no", but your traversal-and-subsequence based algorithms would say "yes". – Igor Tandetnik Jul 22 '13 at 05:03
  • It should return 0 according to the question. Okay, I didn't consider that case. I guess my 2nd algorithm is in fact incorrect after all. – user2560730 Jul 22 '13 at 05:04
  • @Igor, maybe inorder traversal wouldn't give the right check. I was just wondering whether preorder traversal alone would do it. Perhaps I can check this fully, thanks – Karthikeyan Jul 22 '13 at 05:05
  • Yes, brute-force is clearly linear when all elements are unique. Top-level `isEqual` call would perform O(1) work on all nodes in S but one (the one that matches T's root). On that single node, it would do work proportional to the size of T, at worst. For a total of O(sizeof(S) + sizeof(T)). – Igor Tandetnik Jul 22 '13 at 05:08

4 Answers

Summary: consider storing a hash and/or the sub-tree size in each node to speed searches. Your proposed algorithm is broken.

Your proposed algorithm - broken?

If I've understood your proposed alternative algorithm correctly, then it doesn't work. As a counterexample, consider:

  T          S
  x          x
 / \        / \
y   z      y   z
                \
                 q

T has inorder traversal yxz, preorder xyz. S has inorder traversal yxzq, preorder xyzq.

So, T's traversals are found embedded in S's, despite T not being a valid match (as per your recursive approach).
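
The counterexample can be checked mechanically. This is only a sketch under my own assumptions: node values are single `char`s, trees have fewer than 64 nodes (so the fixed buffers suffice), and `traversalsMatch` is a hypothetical name for the proposed traversal-based test. `isEqual` and `isSubtree` mirror the recursive code from the question.

```c
#include <stddef.h>
#include <string.h>

typedef struct binTree {
    char val;                       /* assumption: one char per node */
    struct binTree *left, *right;
} binTree;

static void appendChar(char *buf, char c)
{
    size_t n = strlen(buf);
    buf[n] = c;
    buf[n + 1] = '\0';
}

/* Append the in-order traversal of t to buf (assumed large enough). */
static void inorder(const binTree *t, char *buf)
{
    if (t == NULL) return;
    inorder(t->left, buf);
    appendChar(buf, t->val);
    inorder(t->right, buf);
}

/* Append the pre-order traversal of t to buf (assumed large enough). */
static void preorder(const binTree *t, char *buf)
{
    if (t == NULL) return;
    appendChar(buf, t->val);
    preorder(t->left, buf);
    preorder(t->right, buf);
}

/* The proposed test: both traversals of T occur contiguously in S's. */
int traversalsMatch(const binTree *S, const binTree *T)
{
    char inS[64] = "", inT[64] = "", preS[64] = "", preT[64] = "";
    inorder(S, inS);   inorder(T, inT);
    preorder(S, preS); preorder(T, preT);
    return strstr(inS, inT) != NULL && strstr(preS, preT) != NULL;
}

/* The recursive reference check, as in the question. */
int isEqual(const binTree *S, const binTree *T)
{
    if (S == NULL && T == NULL) return 1;
    if (S == NULL || T == NULL) return 0;
    return S->val == T->val
        && isEqual(S->left, T->left) && isEqual(S->right, T->right);
}

int isSubtree(const binTree *S, const binTree *T)
{
    if (S == NULL) return 0;
    return isEqual(S, T) || isSubtree(S->left, T) || isSubtree(S->right, T);
}
```

For the T and S drawn above, `traversalsMatch` reports a match while `isSubtree` returns 0, which is exactly the disagreement described.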

Quickly eliminating subtrees during a recursive matching process

I'd been thinking along the lines of Karthikeyan's suggestion - store subtree depth at each node, as it lets you eliminate a lot of comparisons. Of course, if maintained dynamically it makes certain tree operations more expensive too - you have to prioritise either those or the extra hit during subtree finds.

Storing a hash of subtree elements is another possibility. What makes sense depends on how dynamically the tree's structure and data are updated compared to the subtree finds, and whether either is more crucial from an overall performance perspective.
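
A rough sketch of the hash idea, assuming a static tree whose hashes are computed once up front. The node type `hNode`, the mixing function, and the names `computeHash` and `isSubtreeHashed` are all illustrative choices of mine, not an established scheme. Because different subtrees can collide on the same hash, a hash match is only a filter and is still confirmed with a full comparison.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct hNode {
    int val;
    uint64_t hash;                  /* hash of the subtree rooted here */
    struct hNode *left, *right;
} hNode;

/* Illustrative mixing step; any reasonable combiner would do. */
static uint64_t mix(uint64_t h, uint64_t x)
{
    return h ^ (x + 0x9e3779b97f4a7c15ULL + (h << 6) + (h >> 2));
}

/* Post-order pass that fills in hash for every node and returns the
   hash of t. An empty subtree gets a fixed marker value. */
uint64_t computeHash(hNode *t)
{
    if (t == NULL)
        return 1;
    uint64_t h = mix(17, (uint64_t)t->val);
    h = mix(h, computeHash(t->left));
    h = mix(h, computeHash(t->right));
    t->hash = h;
    return h;
}

static int sameTree(const hNode *a, const hNode *b)
{
    if (a == NULL && b == NULL) return 1;
    if (a == NULL || b == NULL) return 0;
    return a->val == b->val
        && sameTree(a->left, b->left) && sameTree(a->right, b->right);
}

/* Assumes computeHash was already run on both trees. The full
   comparison runs only where the stored hashes agree. */
int isSubtreeHashed(const hNode *S, const hNode *T)
{
    if (S == NULL || T == NULL)
        return 0;
    if (S->hash == T->hash && sameTree(S, T))
        return 1;
    return isSubtreeHashed(S->left, T) || isSubtreeHashed(S->right, T);
}
```

If the tree is modified, the hashes on the path from the changed node up to the root would have to be recomputed, which is the maintenance cost mentioned above.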

Further reading

Anyway, there are lots of existing questions about this, e.g. Find whether a tree is a subtree of other. Ohhh - found this too - Determine if a binary tree is subtree of another binary tree using pre-order and in-order strings - which seems to support my logic above given you're saying the recursive approach is correct but slow.

Tony Delroy
  • A user pointed that out in the comments. The algorithm is in fact incorrect. I'm looking at Karthikeyan's solution and trying to come up with the algo. – user2560730 Jul 22 '13 at 05:22

What about doing a depth-first search, storing the number of subtree nodes at every node, and comparing only those subtrees of the first tree whose node count equals that of the other tree in question?
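
One way this could look, as a sketch only: instead of storing counts in a dictionary, the version below computes each subtree's size on the way back up a post-order walk and runs the equality check only where the size equals that of `T`. The names `isSubtreeBySize`, `search`, and `countNodes` are hypothetical.

```c
#include <stddef.h>

typedef struct binTree {
    int val;
    struct binTree *left, *right;
} binTree;

static size_t countNodes(const binTree *t)
{
    return t == NULL ? 0 : 1 + countNodes(t->left) + countNodes(t->right);
}

static int isEqual(const binTree *S, const binTree *T)
{
    if (S == NULL && T == NULL) return 1;
    if (S == NULL || T == NULL) return 0;
    return S->val == T->val
        && isEqual(S->left, T->left) && isEqual(S->right, T->right);
}

/* Post-order walk: returns the node count of the subtree rooted at S
   and sets *found when a subtree of exactly tSize nodes equals T. */
static size_t search(const binTree *S, const binTree *T,
                     size_t tSize, int *found)
{
    if (S == NULL)
        return 0;
    size_t n = 1 + search(S->left, T, tSize, found)
                 + search(S->right, T, tSize, found);
    if (!*found && n == tSize && isEqual(S, T))
        *found = 1;
    return n;
}

int isSubtreeBySize(const binTree *S, const binTree *T)
{
    int found = 0;
    if (T == NULL)
        return 0;
    search(S, T, countNodes(T), &found);
    return found;
}
```

Two different nodes with the same subtree size cannot be ancestor and descendant, so the candidate subtrees are disjoint. With k nodes in `T` there are at most n/k candidates, each costing O(k) to compare, which suggests the whole check is linear in the size of the two trees.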

Karthikeyan
  • Imagine that both trees are complete binary trees, with every node (except leaves) having exactly two children. – Igor Tandetnik Jul 22 '13 at 04:53
  • Ah, wait, I take the downvote back. In fact, I think I misunderstood the original problem. It looks like T is only considered a subtree of S when leaves line up with leaves (so that a tree with a single element that happens to match a root in S is not in fact considered a subtree of S). – Igor Tandetnik Jul 22 '13 at 05:00
  • I am not sure if I follow you. Could you back your approach with a code or maybe just a rough algorithm? – user2560730 Jul 22 '13 at 05:00
  • Simple: have a dictionary of every node pointer along with the subtree count of that node, and check only the subtrees whose count equals that of the tree being compared against. Then do a tree equivalence check, which is O(n), just to confirm whether they are equal or not. – Karthikeyan Jul 22 '13 at 05:03

We can use inorder traversal and DFS (which in a binary tree reduces to preorder traversal). First, with DFS, augment both trees so that each node stores the number of nodes in the subtrees under it. Then write out the inorder traversal of both trees and match the strings with KMP, which finds the candidate matches in O(n+m), where n and m are the numbers of nodes in the two trees. We can use hashing to map each position in the traversal back to its node in the DFS-augmented tree. For each KMP match, compare the stored counts of the two trees over the matched sequence; if they agree everywhere, it is a subtree, otherwise we move on to the next KMP match, and so on.

In the example above, the modified data structure for 'T' after DFS is [x(2); y(0); z(0)] and for 'S' it is [x(3); y(0); z(1); q(0)]. Inorder for 'T': yxz. Inorder for 'S': yxzq.

We get the match 'yxz'. Now we go to the DFS-modified structure: there is a mismatch at x (2 vs. 3), so 'T' is not a subtree of 'S'.
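
For the matching step, a minimal KMP sketch over integer sequences might look like this. The function name `kmpSearch`, the `start` parameter for resuming after a rejected match, and the return-code convention are my own assumptions; the count comparison described above is not included.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns the index of the first occurrence of
   pat[0..m) in text[0..n) at or after position start, -1 if there is
   none, or -2 if the failure table could not be allocated. */
long kmpSearch(const int *text, size_t n,
               const int *pat, size_t m, size_t start)
{
    if (m == 0)
        return (long)start;
    if (start > n || m > n - start)
        return -1;

    size_t *fail = malloc(m * sizeof *fail);
    if (fail == NULL)
        return -2;

    /* fail[i] = length of the longest proper prefix of pat[0..i]
       that is also a suffix of it. */
    fail[0] = 0;
    for (size_t i = 1, k = 0; i < m; i++) {
        while (k > 0 && pat[i] != pat[k])
            k = fail[k - 1];
        if (pat[i] == pat[k])
            k++;
        fail[i] = k;
    }

    long pos = -1;
    for (size_t i = start, k = 0; i < n; i++) {
        /* On mismatch, fall back in the pattern without moving i. */
        while (k > 0 && text[i] != pat[k])
            k = fail[k - 1];
        if (text[i] == pat[k])
            k++;
        if (k == m) {
            pos = (long)(i + 1 - m);
            break;
        }
    }
    free(fail);
    return pos;
}
```

If a match at position p fails the count comparison, calling the function again with `start = p + 1` moves on to the next candidate.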

Tanu Saxena

In the array {2,3,5}, the root could be 2, 3, or 5, so you can't represent a tree using an array like this.

If A is a subtree of B (in the sense of your code), and leafs(x) denotes the array of tree x's leaf nodes from left to right, then leafs(A) is a substring of leafs(B).

Once you find such a substring, check the nodes from the leaves up to the root to make sure it really is a subtree.

SliceSort