Why is inorder and preorder traversal useful for creating an algorithm to decide if T2 is a subtree of T1

Question

I'm looking at an interview book and the question is:

You have two very large binary trees: T1, with millions of nodes, and T2, with hundreds of nodes. Create an algorithm to decide if T2 is a subtree of T1.

The author mentions this as a possible solution:

Note that the problem here specifies that T1 has millions of nodes—this means that we should be careful of how much space we use. Let’s say, for example, T1 has 10 million nodes—this means that the data alone is about 40 mb. We could create a string representing the inorder and preorder traversals. If T2’s preorder traversal is a substring of T1’s preorder traversal, and T2’s inorder traversal is a substring of T1’s inorder traversal, then T2 is a substring of T1.

I don't quite follow the logic of why, if these two conditions hold:

T2-preorder-traversal-string is a substring of T1-preorder-traversal-string
T2-inorder-traversal-string is a substring of T1-inorder-traversal-string

then T2 must be a substring (although I assume the author means subtree) of T1. Can I get an explanation of this logic?

EDIT: User BartoszMarcinkowski brings up a good point. Assume both trees have no duplicate nodes.

Assuming you got this question from Cracking the Coding Interview, the author actually does mention that the trees can have duplicate nodes, and even shows an example of it. The way she resolves it is by also printing the null values for leaf nodes. – Cheng, Sep 28 '14 at 18:46

score 4 · Answer 1 · edited Jan 20 '14 at 16:45


I think it is not true. Consider:

T2:

  2
 / \
1   3

inorder 123 preorder 213

and

T1:

      0
     / \
    3   3
   / \ 
  1   1
 / \ 
0   2


inorder 0123103 preorder 0310213

123 is a substring of 0123103 and 213 is a substring of 0310213, but T2 is not a subtree of T1.
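The counterexample above can be checked mechanically. The following is a minimal Python sketch, not code from the book: the `Node` class and the helper names (`preorder`, `inorder`, `same_tree`, `is_subtree`) are introduced here purely for illustration, and `is_subtree` assumes that "subtree" means a node of T1 together with all of its descendants.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(n: Optional[Node]) -> str:
    # Root, then left subtree, then right subtree.
    return "" if n is None else n.key + preorder(n.left) + preorder(n.right)


def inorder(n: Optional[Node]) -> str:
    # Left subtree, then root, then right subtree.
    return "" if n is None else inorder(n.left) + n.key + inorder(n.right)


def same_tree(a: Optional[Node], b: Optional[Node]) -> bool:
    # Structural equality: same keys in the same shape.
    if a is None or b is None:
        return a is b
    return (a.key == b.key
            and same_tree(a.left, b.left)
            and same_tree(a.right, b.right))


def is_subtree(t1: Optional[Node], t2: Optional[Node]) -> bool:
    # Brute-force check: does some node of t1 root a tree identical to t2?
    if t1 is None:
        return t2 is None
    return (same_tree(t1, t2)
            or is_subtree(t1.left, t2)
            or is_subtree(t1.right, t2))


# The two trees drawn above.
t2 = Node("2", Node("1"), Node("3"))
t1 = Node("0",
          Node("3", Node("1", Node("0"), Node("2")), Node("1")),
          Node("3"))

print(inorder(t1), preorder(t1))
print(inorder(t2) in inorder(t1), preorder(t2) in preorder(t1))
print(is_subtree(t1, t2))
```

Both substring tests succeed while the structural check fails, which is exactly the mismatch the counterexample is meant to show.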

answered Jan 20 '14 at 16:40 by Bartosz Marcinkowski · edited Jan 20 '14 at 16:45 by Bernhard Barker


I'm pretty sure one of the constraints is no duplicate nodes. – Daniel Imms Jan 20 '14 at 16:44
Then it would be quite obvious :) – Bartosz Marcinkowski Jan 20 '14 at 16:47

I would also expect this assumption, but +1 for a good counterexample. – Łukasz Kidziński Jan 20 '14 at 16:56

score 1 · Answer 2 · edited May 23 '17 at 12:13


Here is a counter-example to the method.

Consider the tree T1:

  B
 / \
A   D
   / \
  C   E
       \
        F

And the sub-tree T2:

  D
 / \
C   E

The relevant traversals are:

T1 pre-order: BADCEF
T2 pre-order: DCE
T1 in-order: ABCDEF
T2 in-order: CDE

While DCE is in BADCEF and CDE is in ABCDEF, T2 is not actually a sub-tree of T1: in T1, node E has a right child F, so the tree rooted at D in T1 is not identical to T2. The author's definition of sub-tree must have been different, or it was just a mistake.
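A comment on the question mentions that the book handles this by also printing null values for missing children. Below is a hedged Python sketch of that idea applied to the trees above. The `Node` class, the function name `preorder_with_nulls`, and the `#` marker are assumptions made for illustration, and the sketch relies on every key being a single character that differs from the marker; multi-character keys would additionally need a delimiter between entries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder_with_nulls(n: Optional[Node], null: str = "#") -> str:
    # Pre-order traversal that writes a marker for every missing child,
    # so the string records the shape of the tree as well as its keys.
    if n is None:
        return null
    return (n.key
            + preorder_with_nulls(n.left, null)
            + preorder_with_nulls(n.right, null))


# T1 and T2 as drawn above.
t1 = Node("B",
          Node("A"),
          Node("D", Node("C"), Node("E", None, Node("F"))))
t2 = Node("D", Node("C"), Node("E"))

s1 = preorder_with_nulls(t1)
s2 = preorder_with_nulls(t2)
print(s1)
print(s2)
print(s2 in s1)                               # T2 against T1
print(preorder_with_nulls(t1.right) in s1)    # the real subtree rooted at D
```

With the markers in place, T2's string ends in `E##` while T1 contains `E#F##`, so the substring test no longer accepts T2, whereas the tree actually rooted at D in T1 still matches.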

Related question: Determine if a binary tree is subtree of another binary tree using pre-order and in-order strings

answered Jan 20 '14 at 16:35 by Daniel Imms · edited May 23 '17 at 12:13 by Community

Why don't we care about post-order though? Also, your `BADCEF` counter example doesn't make sense..maybe I'm not seeing something – But I'm Not A Wrapper Class Jan 21 '14 at 00:41

What's your definition for subtree? – gen Nov 11 '16 at 16:29

Łukasz Kidziński · Answer 3 · 2014-01-20T16:44:54.443

An important assumption is that the tree has unique keys.

Now, note that preorder-traversal-string and inorder-traversal-string together uniquely identify a binary tree.

Sketch of the proof:

Let T be a tree.

First object in preorder-traversal-string(T) is the root.
Find it in inorder-traversal-string(T): everything to the left of that element is your left subtree L (let's call this substring inorder-traversal-string(L)), and everything to the right is your right subtree R.

Now, let's focus on the left subtree L.

Clearly all subtrees are separated (they don't mix) in both strings: each one is represented as a run of consecutive objects. The only problem is that a priori we don't know where preorder-traversal-string(L) ends in preorder-traversal-string(T).
Note that the strings inorder-traversal-string(L) and preorder-traversal-string(L) have the same length. This gives us the place where to cut.
Now you have a subtree described as substrings inorder-traversal-string(L) and preorder-traversal-string(L) so you can repeat the procedure till the end.

Following those steps (inefficient but it is just for the proof) for all subtrees you will uniquely build the tree.

Thus, all subtrees of T1 are described uniquely by corresponding inorder-traversal-string and preorder-traversal-string.
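The procedure in the proof sketch can be written out directly. This is a minimal, deliberately inefficient Python sketch under the same unique-keys assumption (and additionally assuming single-character keys so that plain strings can stand in for the traversals); the `Node` class and the function name `rebuild` are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rebuild(pre: str, ino: str) -> Optional[Node]:
    # Rebuild a tree with unique single-character keys from its
    # pre-order and in-order traversal strings.
    if not pre:
        return None
    root = pre[0]                      # first pre-order entry is the root
    i = ino.index(root)                # unique keys: exactly one match
    left_in, right_in = ino[:i], ino[i + 1:]
    # The left subtree has the same length in both strings,
    # which tells us where to cut the pre-order string.
    left_pre = pre[1:1 + len(left_in)]
    right_pre = pre[1 + len(left_in):]
    return Node(root,
                rebuild(left_pre, left_in),
                rebuild(right_pre, right_in))


# Example strings: pre-order BADCEF, in-order ABCDEF.
tree = rebuild("BADCEF", "ABCDEF")
print(tree)
```

Each recursive call has only one way to split the strings, which is why the pair of traversals pins down a single tree when keys are unique.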
