I need to group a set of trees into clusters of "similar" trees, but I do not know how to define the distance between two different trees. For the clustering algorithm, my first bet is k-means, but I am not sure about that choice.
I need to evaluate both the topological difference between trees and the data distance: each node contains a value, so two trees with the same structure can hold different values and should then be considered different.
My question is very close to this one: Clustering tree structured data
However, I do not want to cluster stack traces but real trees. What I am not able to do is write a distance function that takes into account both the layout and the content of each node. I am not asking which distance function is good for my scenario, but what the right pattern is to address that goal.
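
To make the goal concrete, here is a minimal sketch of the shape of function I have in mind. It is only an illustration, not a proper tree edit distance: it compares children by position, adds the absolute difference of node values where nodes line up, and adds a fixed penalty for every node that has no counterpart. The names `Node`, `tree_distance`, `w_struct` and `w_value` are hypothetical, and numeric node values are an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical tree node: one numeric value plus ordered children.
    value: float
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def tree_distance(a: Node, b: Node,
                  w_struct: float = 1.0, w_value: float = 1.0) -> float:
    """Naive top-down distance mixing layout and content.

    Children are paired by position (an assumption; a real tree edit
    distance would search for the best alignment instead).
    """
    # Content term: how far apart the two aligned node values are.
    d = w_value * abs(a.value - b.value)

    # Recurse into the children that exist on both sides.
    for ca, cb in zip(a.children, b.children):
        d += tree_distance(ca, cb, w_struct, w_value)

    # Layout term: every node in an unmatched subtree costs w_struct.
    longer = a.children if len(a.children) > len(b.children) else b.children
    matched = min(len(a.children), len(b.children))
    for extra in longer[matched:]:
        d += w_struct * size(extra)

    return d


if __name__ == "__main__":
    t1 = Node(1, [Node(2), Node(3)])
    t2 = Node(1, [Node(2), Node(5)])  # same layout, different value
    t3 = Node(1, [Node(2)])           # different layout
    print(tree_distance(t1, t2), tree_distance(t1, t3))
```

The two weights are where layout and content would be traded off against each other. What I cannot judge is whether a function of this kind is the right building block for clustering at all.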