
I want to apply a hierarchical clustering method (i.e., agglomerative clustering) to several data sets and then compare the resulting clustering trees. Is there a way to do this? Thanks in advance.

Tal Galili
biborno

1 Answer


There are many ways to do this. I suggest looking at the "comparing two dendrograms" section in the vignette for dendextend:

https://cran.r-project.org/web/packages/dendextend/vignettes/introduction.html#comparing-two-dendrograms

Probably the simplest to use is the cor_cophenetic function.

Tal Galili
  • What is the definition of cophenetic correlation when comparing two dendrograms? I could not find the definition or the logic behind it. – biborno Oct 15 '18 at 19:54
  • See https://en.wikipedia.org/wiki/Cophenetic_correlation for the definition. For each dendrogram you build a cophenetic distance matrix: it resembles the original distance matrix (clustering it would give the same hierarchical clustering you got), but the distance between two items is the height of the lowest branch that joins them. One such matrix is derived from each dendrogram, the values for the same pair of items are matched, and a correlation (say, Pearson) is computed on them. This tells you how similarly pairs of items are separated in the two trees. – Tal Galili Nov 03 '18 at 20:23
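
To make the definition in the comment above concrete, here is a minimal plain-Python sketch. It is not dendextend's implementation, only an illustration of the idea. The nested-tuple tree representation `(height, left, right)`, the function names, and the two toy trees are all assumptions made for this example. Both trees must be built over the same set of item labels.

```python
import math
from itertools import product

# Hypothetical toy representation: an internal node is (height, left, right),
# and a leaf is simply its label (a string).

def leaves(node):
    """Return the labels of all leaves under a node."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)

def cophenetic(node, out=None):
    """Map each unordered pair of leaves to the height of the lowest branch joining them."""
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    height, left, right = node
    # Items on opposite sides of this node are first joined at this height.
    for a, b in product(leaves(left), leaves(right)):
        out[frozenset((a, b))] = height
    cophenetic(left, out)
    cophenetic(right, out)
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cor_cophenetic(tree1, tree2):
    """Correlate matched cophenetic distances of two trees over the same labels."""
    d1, d2 = cophenetic(tree1), cophenetic(tree2)
    if d1.keys() != d2.keys():
        raise ValueError("both trees must contain the same item labels")
    pairs = sorted(d1, key=sorted)  # fixed order so the two vectors line up
    return pearson([d1[p] for p in pairs], [d2[p] for p in pairs])

# Two toy trees over the items a, b, c, d (made-up heights).
tree_a = (3.0, (1.0, "a", "b"), (2.0, "c", "d"))
tree_b = (3.0, (1.0, "a", "c"), (2.0, "b", "d"))

print(cor_cophenetic(tree_a, tree_a))  # identical trees give 1.0
print(cor_cophenetic(tree_a, tree_b))  # different groupings give about -0.43
```

A value near 1 means that pairs of items that merge low in one tree also merge low in the other. Values near 0 or below mean the two trees group the items quite differently.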