I am working in R with the dendextend package, trying to compare hclust objects with cor_cophenetic.
I have two objects that result from clustering, clusts and clusts1, and I want to compute the cophenetic correlation between them. I have a few options, shown below:
cor_cophenetic(as.phylo(clusts), as.phylo(clusts1))
[1] 0.1632751
cor_cophenetic(as.dendrogram(clusts), as.dendrogram(clusts1))
[1] 0.1632751
cor_cophenetic(clusts, clusts1)
[1] 0.689649
cor_cophenetic(as.phylo.hclust(clusts), as.phylo.hclust(clusts1))
[1] 0.1632751
I can also try a more direct approach with base R:
cor(as.vector(cophenetic(clusts)), as.vector(cophenetic(clusts1)))
[1] 0.689649
First, I don't understand the difference between calling cor_cophenetic on the hclust objects and calling it on the dendrograms or phylo objects. Is there a correct way here?
Next, I try to run a randomization test on the labels of clusts1:
per <- sample(length(clusts1$labels))
clusts1$labels <- clusts1$labels[per]
The cophenetic correlation on the dendrograms varies across randomizations (I get a distribution), but the direct cophenetic correlation on the hclust objects stays fixed at 0.689649 and does not change. Why is that?
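A minimal reproducible sketch of the fixed base R result, using only base R. It assumes the built-in USArrests data as a stand-in for the real inputs, and the names hc, hc1 and hc1_perm are hypothetical replacements for clusts and clusts1. It only illustrates the observation for the cor(as.vector(cophenetic(...))) approach; it does not cover cor_cophenetic itself.

```r
# Stand-in data (assumption): built-in USArrests replaces the real inputs.
x <- scale(USArrests[1:15, ])

# Two clusterings of the same observations (hypothetical clusts / clusts1).
hc  <- hclust(dist(x), method = "average")
hc1 <- hclust(dist(x), method = "single")

# Direct base R cophenetic correlation, as in the question.
base_cor <- cor(as.vector(cophenetic(hc)), as.vector(cophenetic(hc1)))

# Permute the labels of the second clustering, as in the question.
set.seed(1)
per <- sample(length(hc1$labels))
hc1_perm <- hc1
hc1_perm$labels <- hc1$labels[per]

perm_cor <- cor(as.vector(cophenetic(hc)), as.vector(cophenetic(hc1_perm)))

# The numeric cophenetic distances are unchanged by the relabelling;
# only the names attached to the dist object differ.
same_values <- identical(as.vector(cophenetic(hc1)),
                         as.vector(cophenetic(hc1_perm)))
print(same_values)
print(all.equal(base_cor, perm_cor))
```

In this sketch, as.vector() drops the labels from the dist object, so the correlation is computed on values in observation order regardless of which labels are attached.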