Determine if two binary trees are equal

Question

What would be the efficient algorithm to find if two given binary trees are equal - in structure and content?

Two nodes are equal if their data contents are the same and their left children are equal and their right children are equal. That's all you really need to know. — Hot Licks, Oct 12 '11 at 01:39

score 24 · Accepted Answer · 2012-10-16T19:03:22.593

It's a minor issue, but I'd adapt the earlier solution as follows...

eq(t1, t2) =
  t1.data=t2.data && eq(t1.left, t2.left) && eq(t1.right, t2.right)

The reason is that mismatches are likely to be common, and it is better to detect (and stop comparing) early - before recursing further. Of course, I'm assuming a short-circuit && operator here.

I'll also point out that this is glossing over some issues with handling structurally different trees correctly, and with ending the recursion. Basically, there need to be some null checks for t1.left etc. If one tree has a null .left but the other doesn't, you have found a structural difference. If both have null .left, there's no difference, but you have reached a leaf - don't recurse further. Only if both .left values are non-null do you recurse to check the subtree. The same applies, of course, for .right.

You could include checks for e.g. (t1.left == t2.left), but this only makes sense if subtrees can be physically shared (same data structure nodes) for the two trees. This check would be another way to avoid recursing where it is unnecessary - if t1.left and t2.left are the same physical node, you already know that those whole subtrees are identical.

A C implementation might be...

bool tree_compare (const node* t1, const node* t2)
{
  // Same node check - also handles both NULL case
  if (t1 == t2)  return true;

  // Gone past leaf on one side check
  if ((t1 == NULL) || (t2 == NULL))  return false;

  // Do data checks and recursion of tree
  return ((t1->data == t2->data) && tree_compare (t1->left,  t2->left )
                                 && tree_compare (t1->right, t2->right));
}

EDIT In response to a comment...

The running time for a full tree comparison using this is most simply stated as O(n) where n is kinda the size of a tree. If you're willing to accept a more complex bound you can get a smaller one such as O(minimum(n1, n2)) where n1 and n2 are the sizes of the trees.

The explanation is basically that the recursive call is only made (at most) once for each node in the left tree, and only made (at most) once for each node in the right tree. As the function itself (excluding recursions) only specifies at most a constant amount of work (there are no loops), the work including all recursive calls can only be as much as the size of the smaller tree times that constant.

You could analyse further to get a more complex but smaller bound using the idea of the intersection of the trees, but big O just gives an upper bound - not necessarily the lowest possible upper bound. It's probably not worthwhile doing that analysis unless you're trying to build a bigger algorithm/data structure with this as a component, and as a result you know that some property will always apply to those trees which may allow you a tighter bound for the larger algorithm.

One way to form a tigher bound is to consider the sets of paths to nodes in both trees. Each step is either an L (left subtree) or an R (right subtree). So the root is specified with an empty path. The right child of the left child of the root is "LR". Define a function "paths (T)" (mathematically - not part of the program) to represent the set of valid paths into a tree - one path for every node.

So we might have...

paths(t1) = { "", "L", "LR", "R", "RL" }
paths(t2) = { "", "L", "LL", "R", "RR" }

The same path specifications apply to both trees. And each recursion always follows the same left/right link for both trees. So the recursion visits the paths in the itersection of these sets, and the tightest bound we can specify using this is the cardinality of that intersection (still with the constant bound on work per recursive call).

For the tree structures above, we do recursions for the following paths...

paths(t1) intersection paths(t2) = { "", "L", "R" }

So our work in this case is bounded to at most three times the maximum cost of non-recursive work in the tree_compare function.

This is normally an unnecessary amount of detail, but clearly the intersection of the path-sets is at most as large as the number of nodes in the smallest original tree. And whether the n in O(n) refers to the number of nodes in one original tree or to the sum of the nodes in both, this is clearly no smaller than either the minimum or our intersection. Therefore O(n) isn't such a tight bound, but it's still a valid upper bound, even if we're a bit vague which size we're talking about.

@Steve314 when the first nodes of the trees are same, the algorithm returns true and the comparison stops ?(please correct me if i am wrong). — Sumit Gera, Feb 17 '13 at 07:38
@buzzinga - if the root nodes are both the same node, referenced through the same pointer value, then the algorithm *correctly* returns true and comparison *correctly* stops. If the root nodes are both the same node, then the trees are both the same tree. The values within the trees must be the same because they're the same tree. — , Feb 17 '13 at 11:35
@Steve314 now since the comparison stops, how will the rest of the nodes be compared? and if not compared, how will it be proved that two trees are isomorphic or not? — Sumit Gera, Feb 17 '13 at 11:47
@buzzinga - If you're worrying about the `if (t1 == t2) return true;` line, if the pointers `t1` and `t2` are equal, the (sub)trees don't need to be compared - if this ever finds a match, both nodes are the same node (it was the address of the node that matched), so both (sub)trees are the *same* (sub)tree. Every tree is isomorphic with itself and contains the same values as itself, because every tree is identical with itself. — , Feb 17 '13 at 23:34
@buzzinga - If you're worried about the `(t1->data == t2->data) && ...`, that's a shortcut boolean operator, but it's when the two nodes contain *different* data that the second argument isn't evaluated. `false && x` stops without evaluating `x`, giving the `false` result, because it doesn't matter whether `x` turns out to be `true` or `false`. `true && x` still needs to evaluate `x`, because the final result still depends on whether `x` is `true` or `false`. — , Feb 17 '13 at 23:36
@buzzinga - So the recursion and tree comparison stops when the first *mismatch* is found. If the two trees are isomorphic and contain the same data, but no subtree is shared (ie the same memory addresses), then the whole tree will be checked. Finding that two nodes are the same node stops early too, but only because if they're the same node they're both the root of the same subtree - you know they're identical already. — , Feb 17 '13 at 23:38

score 4 · Answer 2 · answered Sep 27 '09 at 05:23

4

Modulo stack overflow, something like

eq(t1, t2) =
    eq(t1.left, t2.left) && t1.data=t2.data && eq(t1.right, t2.right)

(This generalizes to an equality predicate for all tree-structured algebraic data types - for any piece of structured data, check if each of its sub-parts are equal to each of the other one's sub-parts.)

answered Sep 27 '09 at 05:23

Brian

117,631
17
236
300

Thank you Brian for giving apt explanation. – Rachel Sep 27 '09 at 05:36

score 3 · Answer 3 · edited Jan 05 '16 at 08:09

3

We can also do any of the two traversals (pre-order, post-order or in-order) and then compare the results of both the trees. If they are same, we can be sure of their equivalence.

edited Jan 05 '16 at 08:09

JTJM

255
1
6
18

answered Aug 31 '12 at 05:45

Pranav

31
1

It will also work but it's more costly , because you will have to travel each tree twice i.e. O(4n) + O(2n) to go over their traversals. Also if trees are big enough but have different sets of nodes in early levels, you will still have to traverse them till the end instead of breaking out. – Shrikant Prabhu Apr 07 '18 at 03:36

score 1 · Answer 4 · answered Sep 27 '09 at 05:07

1

A more general term for what you are probably trying to accomplish is graph isomorphism. There are some algorithms to do this on that page.

answered Sep 27 '09 at 05:07

Fragsworth

33,919
27
84
97

1

FWIW, tree isomorphism can be checked in linear time. – ShreevatsaR Sep 27 '09 at 05:10

score 1 · Answer 5 · answered Sep 17 '13 at 22:48

Since it's a proven fact that - it is possible to recreate a binary tree as long as we have the following:

The sequence of nodes that are encountered in an In-Order Traversal.
The sequence of nodes that are encountered in a Pre-Order OR Post-Order Traversal

If two binary trees have the same in-order and [pre-order OR post-order] sequence, then they should be equal both structurally and in terms of values.

Each traversal is an O(n) operation. The traversals are done 4 times in total and the results from the same-type of traversal is compared. O(n) * 4 + 2 => O(n) Hence, the total order of time-complexity would be O(n)

score 0 · Answer 6 · answered Dec 01 '20 at 06:19

bool identical(node* root1,node* root2){
    if(root1 == NULL && root2 == NULL)
        return true;
    if(root1==NULL && root2!=NULL || root1!=NULL && root2 == NULL)
        return false;
    if(root1->data == root2->data){
        bool lIdetical = identical(root1->left,root2->left);
        if(!lIdentical)
            return false;
        bool rIdentical = identical(root1->right,root2->identical);
        return lIdentical && rIdentical;
    }
    else{
        printf("data1:%d vs data2:%d",root1->data,root2->data);
        return false;
    }
}

I do not know if this is the most effecient but I think this works.

score 0 · Answer 7 · edited Jun 20 '20 at 09:12

0

I would write it as follows. The following code will work in most functional language, and even in python if your datatypes are hashable (e.g. not dictionaries or lists):

topological equality (same in structure, i.e. Tree(1,Tree(2,3))==Tree(Tree(2,3),1)):

tree1==tree2 means set(tree1.children)==set(tree2.children)
ordered equality:

tree1==tree2 means tree1.children==tree2.children

(Tree.children is an ordered list of children)

You don't need to handle the base cases (leaves), because equality has been defined for them already.

edited Jun 20 '20 at 09:12

Community

1
1

answered Oct 12 '11 at 01:32

ninjagecko

88,546
24
137
145

Could you elaborate how `set(tree1.children)==set(tree2.children)` would work on `Tree(0,Tree(1,Tree(2,3)))==Tree(0,Tree(Tree(1,3),2))`? – jfs Oct 17 '11 at 18:35
@J.F.Sebastian: it would work correctly (i.e. return the result "not equal") as follows: Is `0==0` and `(1,(2,3))==((1,3),2)`? Let's first check if `(1,(2,3))==((1,3),2)`. Is `1==(1,3)` and `(2,3)==2`? No, because a tree is not equal to an atom. Therefore it is not true that `(1,(2,3))==((1,3),2)`. Therefore it is not true that `(0,(1,(2,3)))==(0,((1,3),2))`. The idea is that set equality is recursively defined on the equality of its elements, and therefore an equality check will innately be recursive. – ninjagecko Oct 18 '11 at 21:04
why do you skip the `1==2` and `(2,3)==(1,3)` check? Does your `set()` definition implies a certain order for elements? Otherwise it is [very expensive](http://www.wolframalpha.com/input/?i=a%5Bn%5D+%3D%3D+1+%2B+2+a%5Bn+-+1%5D%2C+a%5B0%5D+%3D%3D+1) as an implementation. It might be ok as a mere definition (not implying any implementation) though the word *topological* implies *for me* that nodes values are not important and only the shape of the trees matters e.g., [to a topologist, a coffee cup and a donut are the same thing](http://www.youtube.com/watch?v=4iHjt2Ovqag) – jfs Oct 18 '11 at 22:11

Determine if two binary trees are equal

7 Answers7

Linked

Related