
How can I define a hash function for a tree that depends only on the structure of the tree and is the same irrespective of node labels?

For example, 2--1--3--4 should have the same hash as 1--4--2--3 or 4--1--3--2.

  • I don't see any trees here. Are you talking about binary trees or trees with arbitrarily many children per node? – Henry Jun 14 '17 at 09:05
  • Define a 'normalizing' function that brings all trees of a similar structure to one representation. Then you can hash this representation. – Lyth Jun 14 '17 at 09:22
  • See also: https://math.stackexchange.com/questions/1604260/algorithm-for-equality-of-trees-of-restricted-depth – gilleain Jun 14 '17 at 10:38
  • Henry: I meant an arbitrary tree. Sorry for mentioning a binary tree as an example. @Lyth: Can you give an example of such a normalizing function? – Pranjal Jain Jun 14 '17 at 11:28

2 Answers


Find the centre of the tree, then run the following recursive algorithm starting from the centre:

recurse(u, p):

    hash = INITH
    vector childrenhash = {}

    for each (u,v) in G:
        if v!=p:
            childrenhash.insert(recurse(v,u))

    childrenhash.sort()

    for elem in childrenhash:
        hash = (hash * (elem xor PR)) % MOD

    return hash

Choose some appropriate values for INITH, MOD and PR.

Two isomorphic trees will have the same hash.
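
A minimal Python sketch of the steps above, under stated assumptions: the tree is a non-empty adjacency dict, the values of `INITH`, `PR` and `MOD` are placeholders (the answer does not specify them), and the helper names `from_edges`, `find_centres`, `subtree_hash` and `tree_hash` are hypothetical. The centre is found by repeatedly stripping leaves; when there are two centres, the smaller of the two hashes is kept, as suggested in the comments under this answer.

```python
from collections import deque

# Placeholder constants: the answer leaves INITH, PR and MOD unspecified.
INITH = 1_000_003
PR = 0x9E3779B1
MOD = (1 << 61) - 1


def from_edges(edges):
    """Build an undirected adjacency dict from a list of (a, b) edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def find_centres(adj):
    """Return the one or two centre vertices by repeatedly stripping leaves."""
    if len(adj) <= 2:
        return list(adj)
    degree = {u: len(vs) for u, vs in adj.items()}
    leaves = deque(u for u, d in degree.items() if d == 1)
    remaining = len(adj)
    while remaining > 2:
        # Remove the current layer of leaves; their neighbours may become leaves.
        for _ in range(len(leaves)):
            u = leaves.popleft()
            remaining -= 1
            for v in adj[u]:
                degree[v] -= 1
                if degree[v] == 1:
                    leaves.append(v)
    return list(leaves)


def subtree_hash(adj, u, parent=None):
    """Hash the subtree rooted at u, mirroring recurse(u, p) from the answer."""
    child_hashes = sorted(
        subtree_hash(adj, v, u) for v in adj[u] if v != parent
    )
    h = INITH
    for c in child_hashes:
        h = (h * (c ^ PR)) % MOD
    return h


def tree_hash(adj):
    """Hash an unrooted tree: root it at each centre and keep the minimum."""
    return min(subtree_hash(adj, c) for c in find_centres(adj))


# The three paths from the question are the same tree up to relabelling.
h1 = tree_hash(from_edges([(2, 1), (1, 3), (3, 4)]))
h2 = tree_hash(from_edges([(1, 4), (4, 2), (2, 3)]))
h3 = tree_hash(from_edges([(4, 1), (1, 3), (3, 2)]))
print(h1 == h2 == h3)  # True
```

Equal hashes only suggest isomorphism, since distinct trees can collide. The recursion can also exceed Python's default recursion limit on very deep trees, so an iterative traversal may be needed there.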

four_lines
  • What is the definition of centre? – Petar Petrovic Jun 14 '17 at 10:41
  • @PetarPetrovic Oh sorry, the centre is the vertex (or the two vertices) in the middle of the longest path (the diameter) of the tree. – four_lines Jun 14 '17 at 11:45
  • @four_lines Won't the hash depend upon the centre chosen (in case of two centers)? – Pranjal Jain Jun 19 '17 at 06:07
  • @PranjalJain Just stay consistent in your calculation: when hashing several trees, for any tree with two centres compute the hash starting from both centres and always keep the minimum value (or always the maximum, it's up to you). – four_lines Jun 20 '17 at 06:18

If you throw out node labels, what is left is the number of children for each node. So you could just count the number of children for each node and write them all, level by level, in one string (array, vector, ...).

Example:

   a             2       
  / \           / \      
 b   c    =>   0   2     =>  2,0,2,0,0
    / \           / \
   d   e         0   0
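
The encoding in this example can be sketched in Python as follows. This is only an illustration: the `children` dict (node to list of children) and the function name `child_counts` are assumptions, and the level-order traversal is inferred from the `2,0,2,0,0` output above.

```python
from collections import deque


def child_counts(children, root):
    """List the number of children of every node in level (breadth-first) order."""
    counts = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        kids = children.get(node, [])  # nodes missing from the dict are leaves
        counts.append(len(kids))
        queue.extend(kids)
    return counts


tree = {"a": ["b", "c"], "c": ["d", "e"]}
print(child_counts(tree, "a"))  # [2, 0, 2, 0, 0]
```

The labels are only used to walk the tree, so renaming the nodes leaves the resulting list unchanged.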

Now, suppose you're saying, the following trees should be considered equal:

   a              a      
 / | \          / | \    
b  c  d        b  c  d   
  / \  \      / \    |   
 d   e  f    d   e   f  

You can add one more transformation step to the same idea: sort the children of each node by their child counts:

    a                 3                   3       
  / | \             / | \               / | \     
 b  c  d     =>    0  2  1      =>     0  1   2     =>  3,0,1,2,0,0,0
   / \  \            / \  \               |  / \   
  d   e  f          0   0  0              0 0   0 


      a                 3                 3        
    / | \             / | \             / | \      
   b  c  d   =>      2  0  1    =>     0  1   2     =>  3,0,1,2,0,0,0
  / \    |          / \    |              |  / \   
 d   e   f         0   0   0              0 0   0  

I'm probably following the idea in the link by @gilleain: https://math.stackexchange.com/questions/1604260/algorithm-for-equality-of-trees-of-restricted-depth
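
One way to make the "sort the children" step robust is sketched below in Python. Note that this is a swapped-in variant rather than the exact encoding drawn above: instead of a flat list of counts it builds nested sorted tuples (the idea behind the AHU canonical form for rooted trees), so children with the same child count are ordered by the shape of their own subtrees. The `children` dict and the name `canonical_form` are assumptions, and the duplicated label `d` from the diagrams is renamed `d2` here.

```python
def canonical_form(children, node):
    """Nested tuple describing the subtree shape, ignoring labels and child order."""
    return tuple(
        sorted(canonical_form(children, kid) for kid in children.get(node, []))
    )


# The two trees drawn above: the two-child subtree hangs under c in one, under b in the other.
left = {"a": ["b", "c", "d"], "c": ["d2", "e"], "d": ["f"]}
right = {"a": ["b", "c", "d"], "b": ["d2", "e"], "d": ["f"]}

print(canonical_form(left, "a") == canonical_form(right, "a"))  # True
print(hash(canonical_form(left, "a")) == hash(canonical_form(right, "a")))  # True
```

Like the encoding above, this only handles rooted trees: the same unrooted tree rooted at two different nodes generally gives two different forms.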

Lyth
  • Ordering by the number of child nodes is a nice plan, but you'll need a tie-breaker (based on the children's children) -> recursion – wildplasser Jun 14 '17 at 12:28
  • This works for rooted trees. How will you check that the following trees are identical? See [this](http://imgur.com/lV2XPqw) and [this](http://imgur.com/F9H6vPq). These give [2,0,0] and [1,1,0] respectively, but they are structurally the same. – Pranjal Jain Jun 15 '17 at 07:55
  • @PranjalJain are you talking about tree isomorphism - https://stackoverflow.com/a/742698/949044 ? Seems like I missed that point. – Lyth Jun 16 '17 at 09:43