7

Possible Duplicate:
Determine if two binary trees are equal

Got an interview yesterday, a question got me, here it is:

Description

There are 2 binary trees, check if they are equal.

They are equal if and only if tree1->child == tree2->child, and one tree's left and right children can be swapped with each other.

For example:

    5     6
   / \   / \           they are equal.
   1 2   2  1

    5         6
   / \       / \           they are equal.
  1   2     2   1
 /     \   /    / 
3       4 4     3

Any ideas are appreciated.

Community
  • 1
  • 1
Alcott
  • 17,905
  • 32
  • 116
  • 173
  • Are they asking you to test a definition of equality? Then show that it is transitive, reflexive, and symmetric. –  Oct 12 '11 at 00:26
  • @PengOne, well, I don't think the `link you gave` is a duplicate, in my question, a tree's left/right children `can be swapped` and then check equality with the other tree. – Alcott Oct 12 '11 at 00:33
  • @bdares, I don't quite get it, could you plz be more specific? – Alcott Oct 12 '11 at 00:34

6 Answers6

9

Equality operators are transitive: If A=B, and B=C, then A=B=C so A=C.

Equality operators are reflexive: A=A, B=B, and C=C no matter what their values are.

Equality operators are symmetric. If A=B, then B=A. (It doesn't matter what order they are in.)

Now, taking a look at the definition they gave you:

A tree is equal to another tree if the children are equal. Let's see. We can assume that the nodes are being compared at the bottom, or else the definition is pretty useless. But they don't bother to tell you how to resolve that comparison, and the whole definition they gave you hinges on it.

In short, it's a crappy question.

Let's see what happens if we decide we want to try and unravel the question, though.

But wait, they also tell you that the two children of any tree can be swapped. This adds the constraint that any tree that is equal to anything else (including itself) must be equal to its mirror image. And any variations of its subtrees' children being swapped.

And remember that this is supposed to be a search tree. Therefore, we can probably assume that two different search trees that are processed by the same algorithm must give the same result if they are equal. So, if we switch around the elements of a tree, then the search time would be affected. So, trees that do not have every node in place are not equal to each other.

Putting that together with the "swappable" property of this equality, we can see it's not a valid definition of equality. (If we try to apply it, then it turns out that only trees that have the same node for every node at a particular level are equal, and only to themselves, which breaks the reflexivity part of an equality operator.)

  • @Alcott: this answer is not saying that your question was bad. It is saying that it's a crappy interview question because the definition of equality is invalid. Unless it was a trick question, I would not want to work at this company. – ninjagecko Oct 12 '11 at 01:07
  • @ninjagecko, well, does my drawing make the definition of `equality` clearer? – Alcott Oct 12 '11 at 01:09
  • @Alcott yes I meant the interview question was bad, not your question. –  Oct 12 '11 at 01:09
  • @Alcott: yes, also `tree1->child` and `tree2->child` is not well-defined. That is incorrect phrasing on their part, because a node does not have just one child. – ninjagecko Oct 12 '11 at 01:15
  • @ninjagecko, `tree->child` means, it could be left child or right child, cuz you can swap the subtree – Alcott Oct 12 '11 at 01:35
  • @Alcott: you can only swap the subtree *during equality testing*; your claim that the child doesn't matter would mean that all nodes at all levels must be equal – ninjagecko Oct 12 '11 at 03:33
  • btw, programming is not math, so `A=A` is not always true e.g., `nan != nan`. – jfs Oct 15 '11 at 15:32
8

I don't think this is an unreasonable question. A simple recursive solution is

boolean equals(x, y)
{
  if (x == null)
  {
    return y == null;
  }
  if (y == null)
  {
    return false;
  }
  if (x.val != y.val)
  {
    return false;
  }
  if (equals(x.left, y.left) && equals(x.right, y.right))
  {
    return true;
  }
  if (equals(x.left, y.right) && equals(x.right, y.left))
  {
    return true;
  }
  return false;
}

This can be very expensive, for example in the case when we have two large trees of similar shape where all the non-leaf nodes have the same associated value and the leaf nodes of one are a permutation of the leaf nodes of another.

To get past this you could first of all change left and right as required so that left < right, for some recursive definition of <. This could also be expensive, but much less so than checking every permutation, and I think a choice of definition of < would help. This would then allow you to check for equality with an ordinary definition.

This notion of http://en.wikipedia.org/wiki/Canonicalization followed by ordinary equality also resolves questions about whether you really do have an equivalence relation. An equivalence relation is equivalent to a partition. Ordinary equality is obviously a partition. If you compare x and y by comparing f(x) and f(y) followed by an equivalence relation you have a partition of x and y, and therefore an equivalence relation.

Thinking more about this, I think the way to make either canonicalisation or equality-testing reasonably efficient is to work from the bottom up, annotating each node with a token whose value reflects the result of comparisons with other nodes, so that you can compare nodes, and the subtrees below them, just be comparing tokens.

So the first step for equality is to e.g. use a hash table to annotate each leaf with tokens which are equal only when the values at the leaves are equal. Then, for nodes whose only children are leaves, use e.g. a hash table to assign further tokens so that the tokens in those nodes are equal only when the leaves, if any, beneath those nodes match. Then you can go one more step up, and this time you can compare tokens at child nodes instead of recursing down the tree there. The cost of assigning tokens in this way should be linear in the size of the trees involved. At the top you can compare trees just by comparing the tokens at the root.

mcdowella
  • 19,301
  • 2
  • 19
  • 25
  • I did give an recursive solution which is pretty much the same as yours, but the interviewer said that was too expensive. – Alcott Oct 12 '11 at 06:10
  • 1
    This is an important point that you should of added to your question – Louis Ricci Oct 12 '11 at 11:32
  • (Edited to add a bottom-up approach which I think is now reasonably efficient, and so answers your interviewer's objection). – mcdowella Oct 12 '11 at 17:19
  • I've implemented in Python solution that uses [canonization approach](http://stackoverflow.com/questions/7733986/2-binary-trees-are-equal-or-not/7780341#7780341) that you suggested. – jfs Oct 15 '11 at 20:11
3

If you implement their definition of "equality" with flip-invariance, you will violate the definition of equality. The definition doesn't even make sense, because that's not how binary search trees are equal (unless each node has a pointer to which subtree is "greater" and which is "lesser").

You have two choices of reasonable definitions:

  1. topological (flip-agnostic) equivalence (in which case you can't call it a "binary search tree" because it's not sorted):

    tree1==tree2 means set(tree1.children)==set(tree2.children)

  2. normal search tree (flip-caring) equivalence:

    tree1==tree2 means list(tree1.children)==list(tree2.children)

For binary trees, the above definitions will work as-written in any language which supports the list and set datatypes (python sets will choke however on unhashable datatypes). Nevertheless, below are some more verbose and ugly C/Java-like definitions:

  1. topological equivalence:

    t1==t2 means (t1.left==t2.left and t1.right==t2.right) or (t1.left==t2.right and t1.right==t2.left)

  2. sorted tree equivalence:

    t1==t2 means (t1.left==t2.left and t1.right==t2.right)

The definitions above are recursive; that is, they assume equality has been defined for the subtrees and base-cases already, which it has.


sidenote:

quote: tree1->child == tree2->child

This is not a valid statement, because a tree node does not have a single child.

Community
  • 1
  • 1
ninjagecko
  • 88,546
  • 24
  • 137
  • 145
  • about your sidenote: could be valid, if you're treating `child` as a collection of node objects... I once implemented a tree using a bunch of Map and List objects recursively, and the children of each node were stored as a list, which was mapped to by the node by the string "child". –  Oct 12 '11 at 02:27
  • @bdares: yes agree, but in which case I would call that `.children` (as I did in the answer) – ninjagecko Oct 12 '11 at 03:30
  • @ninjagecko: Did you mean *"because that's **not** how binary search trees are equal"* ? – Rob Oct 12 '11 at 03:45
  • @robjb: yes, thank you for finding that typo. fixed. – ninjagecko Oct 12 '11 at 04:25
1

Compare trees using canonization approach suggested by @mcdowella. The difference is that my approach doesn't require O(N) additional memory w.r.t number of nodes in the tree:

# in Python
from collections import namedtuple
from itertools import chain

# Tree is either None or a tuple of its value and left, right trees
Tree = namedtuple('Tree', 'value left right')

def canonorder(a, b):
    """Sort nodes a, b by their values.

    `None` goes to the left
    """
    if (a and b and a.value > b.value) or b is None:
        a, b = b, a # swap
    return a, b

def canonwalk(tree, canonorder=canonorder):
    """Yield all tree nodes in a canonical order.

    Bottom-up, smaller children first, None is the smallest
    """
    if tree is not None:
        children = tree[1:]
        if all(t is None for t in children): return # cut None leaves
        children = canonorder(*children)            
        for child in chain(*map(canonwalk, children)):
            yield child
    yield tree 

canonwalk() requires O(N*M) steps and O(log(N)*M) memory to yield all nodes in a tree, where N is total number of nodes, M number of children each node has (it is 2 for binary trees).

canonorder() could be easily generalized for any node representation and any number of children. canonwalk() requires only that a tree can access its immediate children as a sequence.

The comparison function that calls canonwalk():

from itertools import imap, izip_longest

unset = object() 
def cmptree(*trees):
    unequal = False # allow root nodes to be unequal
    # traverse in parallel all trees under comparison
    for nodes in izip_longest(*imap(canonwalk, trees), fillvalue=unset):
        if unequal:
            return False # children nodes are not equal
        if any(t is unset for t in nodes):
            return False # different number of nodes
        if all(t is not None for t in nodes):
            unequal = any(nodes[-1].value != t.value for t in nodes)
        else: # some are None
            unequal = any(t is not None for t in nodes)
    return True # equal

Example

    5         6
   / \       / \           they are equal.
  1   2     2   1
 /     \   /    / 
3       4 4     3

tree1 = Tree(5, 
             Tree(1, 
                  Tree(3, None,None), None), 
             Tree(2, 
                  None, Tree(4, None, None)))
tree2 = Tree(6, 
             Tree(2, Tree(4, None, None), None),
             Tree(1, Tree(3, None, None), None))
print cmptree(tree1, tree2)

Output

True
Community
  • 1
  • 1
jfs
  • 399,953
  • 195
  • 994
  • 1,670
0

I read the questions as: given two binary trees, for each depth in the tree, find out whether their children's set are covered in each other.

This can be coded relatively easy.

Kunukn
  • 2,136
  • 16
  • 16
  • -1. That is not what the question is stating, and this doesn't even attempt to answer either interpretation of the question. – Rob Oct 12 '11 at 14:49
0

Solution without recursion in Ruby

def same? top_t1, top_t2
  for_chek << [top_t1, top_t2]   # (1) put task for check into queue

  while t1,t2 = for_check.shift  # (2)
    return false unless t1.children.count == t2.children.count  # generally for non-binary tree, but also needed for controlling of nil children
    break if t1.children.empty?

    t1_children = t1.children.sort # this is sorted arrays
    t2_children = t2.children.sort # of childrens      
    return false unless t1_children == t2_children  # (3)

    0.upto(t1_children.count - 1) do |i|
      for_check << [t1_children[i], t2_children[i]]  # put equivalent child pairs into queue
    end
  end
  return true
end

Ruby syntax tips:

  • (1) putting element into array: arr << elem; in this case for_check is array of arrays
  • (2) parallel assignment: t1,t2 = [item1, item2]. Same as arr = [item1, item2]; t1 = arr[0]; t2 = arr[1]
  • (3) t1_children == t2_children assumed corresponding behavior of == for this kind of objects. More verbose will be t1_children.map { |el| el.val } == t2_children.map { |el| el.val } - here map produces array of vals.
x'ES
  • 554
  • 5
  • 15