9

Alright guys, I was asked this question in an interview today and it goes like this:

"Tell if one binary tree is contained inside another binary tree or not (contains implies both in structure and value of nodes)"

I thought of the following approach:

Flatten the larger tree as:

{{{-}a{-}}b{{-}c{-}}}d{{{-}e{{-}f{-}}}g{{{-}h{-}}i{{-}j{-}}}}

(I did actually write code for this. {-} denotes an empty left or right sub-tree, and each sub-tree is enclosed within {} braces.)

Now for smaller sub-tree we need to match this pattern:

{{.*}e{.*}}g{{{.*}h{.*}}i{{.*}j{.*}}}

where {.*} denotes an empty or non-empty sub-tree.
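The flattening code is not shown in the question, so the following is only a minimal sketch of how such a string could be produced. The `Node` class, `flatten`, `wrap`, and `exampleTree` are hypothetical names, and the tree shape is read off the flattened string above (root d, with b over a and c on the left, and g over e and i on the right).

```java
public class FlattenDemo {
    /** Hypothetical node type; the question does not show its own. */
    static final class Node {
        final String value;
        final Node left;
        final Node right;

        Node(String value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    /** In-order flattening: wrapped left subtree, value, wrapped right subtree. */
    static String flatten(Node n) {
        return wrap(n.left) + n.value + wrap(n.right);
    }

    /** An empty subtree becomes {-}; a non-empty one is enclosed in braces. */
    static String wrap(Node n) {
        return n == null ? "{-}" : "{" + flatten(n) + "}";
    }

    static Node leaf(String v) {
        return new Node(v, null, null);
    }

    /** The larger tree from the question, as read from its flattened form. */
    static Node exampleTree() {
        Node b = new Node("b", leaf("a"), leaf("c"));
        Node e = new Node("e", null, leaf("f"));
        Node i = new Node("i", leaf("h"), leaf("j"));
        Node g = new Node("g", e, i);
        return new Node("d", b, g);
    }

    public static void main(String[] args) {
        System.out.println(flatten(exampleTree()));
    }
}
```

Note that the root itself is not wrapped in braces, which is what the string in the question suggests.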

At the time I thought this would be a trivial regex pattern matching problem in Java, but I am bamboozled. Actually, now I feel I have just transformed the problem (created one monster out of another).

Is there a simple regex one liner to match these patterns? I understand there might be other approaches to solve this problem and this might not be the best one. I just wonder if this is solvable.

VJune
  • Does "in structure" mean "the same object" or ".equals()" (with an appropriate implementation)? E.g., if tree one is a leaf with value "4", and tree two also has a leaf with value "4" (but which is a different object than tree one), does tree two contain tree one? – Joshua Taylor Aug 19 '13 at 17:21
  • I don't see a requirement in the question asked initially to use regular expressions. Was this part of the interview question? Regexes really seem like the wrong tool entirely for this job. – Dark Falcon Aug 19 '13 at 17:23
  • Along with @DarkFalcon I suspect that an algorithm that *must* traverse the entirety of both trees might not be what the interviewers were hoping for. After all, after looking at the top few nodes of two trees, you can determine which subtrees possibly have overlap, and which don't. Even if you do want to use string representations of the trees, as long as your delimiters are balanced, can't you do this just by checking whether the string of the possibly contained tree is a substring of the string of the possibly containing tree? – Joshua Taylor Aug 19 '13 at 17:24
  • @JoshuaTaylor - How would "looking at the top few nodes" work? There's nothing about the trees being sorted in any way. – Ted Hopp Aug 19 '13 at 17:25
  • @TedHopp Oh, very good point! I'd assumed binary search trees, but you're absolutely right, this isn't mentioned in the question. Ok, _if_ it's a binary search tree, then there can be some solutions that don't require traversing the whole tree. Good catch! – Joshua Taylor Aug 19 '13 at 17:26
  • I assume that "contained inside" doesn't mean "subtree of"? So, for instance, the tree consisting of a single node having A is contained inside the three-node tree having A at the root and children B and C? – Ted Hopp Aug 19 '13 at 17:32
  • @TedHopp Though the problem is definitely under-specified, that seems like a strange interpretation of "contained inside" to me. After all, "the tree consisting of a single node having A" is "a node whose value is A, left subtree is the empty tree, and whose right subtree is the empty tree", and the "three-node tree having A at the root and children B and C" doesn't have a node like that. However, interpreting it where the empty tree in the possible member is interpreted as a wildcard in the possible containing tree is certainly another interesting exercise. – Joshua Taylor Aug 19 '13 at 17:37
  • @TedHopp: your interpretation is correct. In other words, we can cut branches off from the larger tree in a way to exactly obtain the smaller one. At least this is what I understood from the problem specification. – VJune Aug 19 '13 at 17:51
  • @aryan In that case, then neither of the answers so far (Joowani's or mine) are suitable; they only address finding subtrees of a tree. The problem you're describing may be rather more difficult. – Joshua Taylor Aug 19 '13 at 18:03
  • @JoshuaTaylor - It's basically the labelled subgraph isomorphism problem restricted to graphs that are binary trees. – Ted Hopp Aug 19 '13 at 18:08
  • @TedHopp Indeed, for general subgraph isomorphism, there's no polynomial algorithm known; I don't know whether there is for the special case of binary trees. This interpretation would make this a surprisingly difficult interview question. (Though if someone answered it, they'd probably have something to offer, to be sure.) – Joshua Taylor Aug 19 '13 at 18:13

4 Answers

1

I'm not sure what the interviewer meant exactly by "contained inside another binary tree". If the interviewer was asking for a method to check whether A was a subtree of B, here is one method that does not require regex at all:

  • Flatten the trees A and B using preorder traversal to get strings, say, preA and preB
  • Flatten the trees A and B using inorder traversal to get strings, say, inA and inB
  • Make sure to include the null leaves in the strings as well (using whitespaces for example)
  • Check if preA is a substring of preB AND inA is a substring of inB
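The steps above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the `Node` class and all method names are hypothetical, `#` stands in for a null leaf, and every token is followed by a `,` separator (with one leading separator) so that a value such as "B" cannot match inside a longer value such as "AB". Node values are assumed to contain neither `#` nor `,`.

```java
public class TraversalSubtreeCheck {
    /** Hypothetical node type used only for this sketch. */
    static final class Node {
        final String value;
        final Node left;
        final Node right;

        Node(String value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    private static final String NULL_MARK = "#"; // assumed never to be a node value
    private static final String SEP = ",";       // assumed never to occur inside a value

    private static void preorder(Node n, StringBuilder out) {
        if (n == null) {
            out.append(NULL_MARK).append(SEP);
            return;
        }
        out.append(n.value).append(SEP);
        preorder(n.left, out);
        preorder(n.right, out);
    }

    private static void inorder(Node n, StringBuilder out) {
        if (n == null) {
            out.append(NULL_MARK).append(SEP);
            return;
        }
        inorder(n.left, out);
        out.append(n.value).append(SEP);
        inorder(n.right, out);
    }

    /** Preorder string with null leaves; starts and ends with a separator. */
    static String pre(Node n) {
        StringBuilder sb = new StringBuilder(SEP);
        preorder(n, sb);
        return sb.toString();
    }

    /** Inorder string with null leaves; starts and ends with a separator. */
    static String in(Node n) {
        StringBuilder sb = new StringBuilder(SEP);
        inorder(n, sb);
        return sb.toString();
    }

    /** True if a occurs in b as a complete subtree (not merely as a part). */
    static boolean isSubtree(Node a, Node b) {
        return pre(b).contains(pre(a)) && in(b).contains(in(a));
    }
}
```

For example, with B as the tree A(B(D, E), C), the call `isSubtree` returns true for the tree B(D, E) but false for a single leaf B, because the leaf's null markers do not line up with B's children.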

The reason to include the null leaves is that when multiple nodes have the same value, the preorder and inorder strings alone may not be enough to tell different trees apart. Here is an example:

          A
      A       A
   B     B       C
 C         D       B
D           C       D 

You can also check this out:

checking subtrees using preorder and inorder strings

Also read this for more info on preorder and inorder traversals of binary trees:

http://en.wikipedia.org/wiki/Tree_traversal

Now, if he DIDN'T mean just subtrees, the problem may become more complicated depending on what the interviewer meant by "contained inside". You could look at the question as a subgraph isomorphism problem (trees are just a special case of graphs), which is NP-complete for general graphs.

http://en.wikipedia.org/wiki/Subgraph_isomorphism_problem

There may be better approaches since trees are much more restricted and simpler than graphs.

Joohwan
  • This only works to detect if one tree is a subtree of another, not to detect whether one tree is "contained inside" another (possibly not as a subtree). – Ted Hopp Aug 19 '13 at 17:32
  • Can't this be done in just one preorder traversal? E.g., if you generate the lisp-like string where `(value <left> <right>)` is the node whose value is `value` and whose left and right subtrees' strings are `<left>` and `<right>`, isn't the single substring check sufficient? – Joshua Taylor Aug 19 '13 at 17:34
  • @JoshuaTaylor - You need both checks. Read the thread that Joowani links to for examples of why. – Ted Hopp Aug 19 '13 at 17:36
  • @TedHopp The examples there don't include any punctuation, as I did in my comment. With punctuation, the tree that has `B` as its root and `A` as left child is `(B A nil)`, and the tree that has `A` as its root and `B` as right child is `(A nil B)`. With punctuation, I don't think that two traversals are necessary. (Note that this could still be done with the in-order traversal, since `(A B nil)` and `(nil A B)` are distinct too.) – Joshua Taylor Aug 19 '13 at 17:40
  • I don't think it works for parts of trees. Consider the one-node tree containing A; the preorder string will be "A.." (using "." to represent a null leaf). This tree is contained in any tree that has an A node. However, the three-node tree with A as a root and children B and C has the preorder traversal string "AB..C.." and this does not contain the substring "A..". – Ted Hopp Aug 19 '13 at 18:44
  • Yes you are right Ted it only works on subtrees. I will clarify that in my answer. – Joohwan Aug 19 '13 at 20:19
0

You can do this using a substring check as described in the other answers, and using just one traversal (pre-order, in-order, or post-order), so long as you are printing the entirety of each node in the tree, not just its value. For instance, a binary tree is either

  • the empty tree, which we will print as null, or
  • a value and two trees, which we print as (value left-tree right-tree), where left-tree and right-tree are replaced by the representation of the left and right subtrees.

Each tree now has an unambiguous printed representation, and so a tree T is a subtree of a tree S if and only if the string representation of T is a substring of the string representation of S.

For instance, the tree

    A
   / \
  B   C
 / \
D   E

is represented as

(A (B (D null null) (E null null)) (C null null))

and you can check that the subtrees of this tree have strings

(A (B (D null null) (E null null)) (C null null))
(B (D null null) (E null null))
(D null null)
(E null null)
(C null null)

each of which is a substring of the string for the whole tree.
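A minimal Java sketch of this serialization and the substring check might look like the following. The `Node` class and the method names are hypothetical, and values are assumed to contain no spaces or parentheses and never to be the literal text `null`.

```java
public class SexprSubtreeCheck {
    /** Hypothetical node type used only for this sketch. */
    static final class Node {
        final String value;
        final Node left;
        final Node right;

        Node(String value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    /** Prints the empty tree as null and a node as (value left-tree right-tree). */
    static String serialize(Node n) {
        if (n == null) {
            return "null";
        }
        return "(" + n.value + " " + serialize(n.left) + " " + serialize(n.right) + ")";
    }

    /** True if t is a subtree of s, decided by a single substring test. */
    static boolean isSubtree(Node t, Node s) {
        return serialize(s).contains(serialize(t));
    }

    static Node leaf(String v) {
        return new Node(v, null, null);
    }

    /** The five-node example tree: A over B and C, B over D and E. */
    static Node exampleTree() {
        Node b = new Node("B", leaf("D"), leaf("E"));
        return new Node("A", b, leaf("C"));
    }

    public static void main(String[] args) {
        System.out.println(serialize(exampleTree()));
    }
}
```

Because the opening parenthesis sits directly before the value and a space directly after it, a value can only match a whole value under the assumptions stated above.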

The only caveats of course are cases where the string representations of the values interfere with the serialization of the trees (e.g., the value strings contain spaces, or parentheses, &c.), so to make this robust, you'd want to take appropriate measures with delimiters or escapes.

Also note that not every substring of a tree's string corresponds to a subtree of that tree. For instance, the string null) (E is a substring of the string for the tree above, but does not correspond to a subtree of the tree; only when a string s is itself the representation of a tree t does s being a substring of the string s' of a tree t' mean that t is a subtree of t'.

Joshua Taylor
0

Strictly speaking, regular expressions are not equipped to deal with nested brackets. Nesting can be matched using recursive regular expressions, but Java's regex engine does not support recursive expressions. In Perl or PHP, you could use a pattern something like

{(?:(?R)|-)}\w{(?:(?R)|-)}

to match some tree structure, but you would still not be able to specify the values of child nodes at specific levels.

So, there is unfortunately no easy one line of regex that will solve this problem. Regex is not the tool you need for this job.

In order to answer this question, I would recommend constructing your large tree and small tree, then calling largeTree.contains(smallTree) using the following class:

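public class BTreeNode
{

public String value;
public BTreeNode left;
public BTreeNode right;

// Returns true if 'tree' can be obtained from this tree by cutting off branches.
public boolean contains(BTreeNode tree)
{
  // First try to match the small tree with its root placed on this node.
  boolean retVal = visit(tree, this);

  // Otherwise look for the whole small tree further down on either side.
  if (!retVal && left != null)
    retVal = left.contains(tree);

  if (!retVal && right != null)
    retVal = right.contains(tree);

  return retVal;
}

// Top-down match: a missing child in the small tree matches anything.
private boolean visit(BTreeNode small, BTreeNode large)
{
  boolean retVal;

  if (small == null)
  {
    retVal = true;
  }
  else if (large == null)
  {
    // The small tree has a node where the large tree has none.
    retVal = false;
  }
  else if (small.value.equals(large.value))
  {
    retVal = visit(small.left, large.left) && visit(small.right, large.right);
  }
  else
  {
    retVal = false;
  }

  return retVal;
}

}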

In the worst case, a traversal of the small tree will be performed for each node of the large tree, which is O(m * n), where m is the size of the large tree and n is the size of the small tree. The worst case can be reached, for example, when both trees are degenerate chains leaning the same way, every element of both trees is equal, and the small tree is one node longer than the large tree, so that every attempt runs to the bottom of the large tree before failing.
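To try the top-down matching idea on the trees from the question, here is a small self-contained sketch. It is not the class above: `Node`, `matchesAt`, and the static `contains` are hypothetical names chosen for illustration, and the sample tree is the one read off the question's flattened string.

```java
public class ContainsDemo {
    /** Hypothetical node type used only for this sketch. */
    static final class Node {
        final String value;
        final Node left;
        final Node right;

        Node(String value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static Node leaf(String v) {
        return new Node(v, null, null);
    }

    /** Matches small against large with both roots aligned; a null in small is a wildcard. */
    static boolean matchesAt(Node small, Node large) {
        if (small == null) {
            return true;
        }
        if (large == null) {
            return false;
        }
        return small.value.equals(large.value)
                && matchesAt(small.left, large.left)
                && matchesAt(small.right, large.right);
    }

    /** Tries every node of large as a possible position for the root of small. */
    static boolean contains(Node large, Node small) {
        if (matchesAt(small, large)) {
            return true;
        }
        if (large == null) {
            return false;
        }
        return contains(large.left, small) || contains(large.right, small);
    }

    /** The larger tree from the question. */
    static Node questionTree() {
        Node b = new Node("b", leaf("a"), leaf("c"));
        Node e = new Node("e", null, leaf("f"));
        Node i = new Node("i", leaf("h"), leaf("j"));
        return new Node("d", b, new Node("g", e, i));
    }

    /** The smaller tree from the question: g over e and i, i over h and j. */
    static Node questionPattern() {
        Node i = new Node("i", leaf("h"), leaf("j"));
        return new Node("g", leaf("e"), i);
    }

    public static void main(String[] args) {
        System.out.println(contains(questionTree(), questionPattern()));
    }
}
```

The pattern's e is a leaf while the large tree's e has a right child f; under the "cut branches off" interpretation that extra branch is simply ignored.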

Luke Willis
0

Example: please see the attached binary tree example with T1 and T2.

[Image: example binary trees T1 and T2]

Please note that this problem is not the same as the subtree problem. Binary tree T2 is contained inside binary tree T1 if T2 appears in T1 with exactly the same layout; T1 may contain extra structure, which does not matter.

I have solved this problem with the code below. It is a bit too involved to explain, so please step through it in a debugger to understand it. The function is called as shown in the line below. Here tree1 is T1, tree2 is T2, and size2 is the size of the T2 tree.

return(containsInternal(tree1.getRoot(), tree2.getRoot(), size2));

// This function checks if the node1 is contained with in another binary tree with starting point of node2 [ which means node1->m_data==node2->m_data has been verified ].
// This is not same as subtree problem. Read the code carefully.
bool checkContains(const BinaryTreeNode<int>* node1, const BinaryTreeNode<int>* node2, long& iterations)
{
  if(iterations<0)
    return(true);
  if(!node1)
    return(true);

  bool returnStatus1=true, returnStatus2=false, returnStatus3=true;
  if(node1->m_leftChild)
  {
    if(!node2->m_leftChild)
      return(false);
    else
      returnStatus1=checkContains(node1->m_leftChild, node2->m_leftChild, iterations);
  }
  //cout<<"Attempting to compare "<<node1->m_data<<" and "<<node2->m_data<<" with iterations left = "<<iterations<<endl;
  if(node1->m_data==node2->m_data)
  {
    returnStatus2=true;
    --iterations;
  }

  if(node1->m_rightChild)
  {
    if(!node2->m_rightChild)
      return(false);
    else
      returnStatus3=checkContains(node1->m_rightChild, node2->m_rightChild, iterations);
  }
  return(returnStatus1&&returnStatus2&&returnStatus3);
}

// Iterate tree starting at node1 in in order traversal and if node matches node of tree2 then start doing contains checking further.
bool containsInternal(const BinaryTreeNode<int>* node1, const BinaryTreeNode<int>* node2, long size)
{
  if(!node1||!node2)
    return(false);
  bool result1=containsInternal(node1->m_leftChild, node2, size);
  bool result2=false;
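  if(node1->m_data==node2->m_data)
  {
    long remaining=size;  // Fresh counter for every attempt; otherwise failed partial matches drive it negative and a later checkContains call returns true too early.
    result2=checkContains(node2, node1, remaining);  // Note: node2 is passed as the first argument since checkContains traverses the structure of its first argument. size is the size of the tree rooted at node2.
  }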
  bool result3=containsInternal(node1->m_rightChild, node2, size);
  return(result1||result2||result3);
}
// Checks if the tree2 is a part of the tree1.
bool contains(BinaryTree<int>& tree1, BinaryTree<int>& tree2)
{
  size_t size1=tree1.size();
  size_t size2=tree2.size();
  if(!size2)
    return(true); // null tree is always contained in another tree. 
  if(size2>size1)
    return(false);  // The tree2 can not be inside tree1 if it is bigger in size.
  return(containsInternal(tree1.getRoot(), tree2.getRoot(), size2));
}
Anand Kulkarni