I want to create a dendrogram from an index (proportion data) so that individuals with similar index values cluster together. I am trying to decide which distance/similarity metric to use so that the resulting distances reflect the original index values.
I have a data frame that looks like this:
data<-read.table(text="ind index
T1 0.10
T2 0.11
T3 0.01
T4 0.64
T5 0.03
T6 0.15
T7 0.26
T8 0.06
T9 0.01
T10 0.004
T11 0.01
T12 0.19
T13 0.04
T14 0.69
T15 0.06
T16 0.51
T17 0.15
T18 0.26
T19 0.26
T20 0.01
", header = TRUE)
head(data)
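# keep the index as a one-column matrix and carry the IDs over as row names,
# so the dendrogram leaves are labelled T1..T20 instead of 1..20
data2 <- as.matrix(data[, 2, drop = FALSE])
rownames(data2) <- as.character(data$ind)
# Euclidean distance (the default) between index values
d <- dist(data2)
# prepare hierarchical cluster
hc <- hclust(d)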
# very simple dendrogram
plot(hc)
This produces a simple dendrogram. However, I actually want the values in the index column themselves to act as "my distance". Any suggestions are welcome. Thanks in advance!
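
To make the question concrete, here is a minimal sketch of one possible reading of "using the index as the distance": the distance between two individuals is the absolute difference of their index values. This is an assumption about the goal, not a settled answer. The names `index`, `d_abs` and `d_euc` are made up for illustration, and `method = "average"` is just one linkage choice among several. For a single numeric column, the default Euclidean `dist()` already equals that absolute difference, which the sketch checks.

```r
# Index values from the question, named by individual
index <- c(T1 = 0.10, T2 = 0.11, T3 = 0.01, T4 = 0.64, T5 = 0.03,
           T6 = 0.15, T7 = 0.26, T8 = 0.06, T9 = 0.01, T10 = 0.004,
           T11 = 0.01, T12 = 0.19, T13 = 0.04, T14 = 0.69, T15 = 0.06,
           T16 = 0.51, T17 = 0.15, T18 = 0.26, T19 = 0.26, T20 = 0.01)

# Assumed interpretation: distance = |index_i - index_j|
d_abs <- as.dist(abs(outer(index, index, "-")))

# Default Euclidean distance on the same single variable
d_euc <- dist(index)

# In one dimension the two should agree up to floating-point tolerance
same <- isTRUE(all.equal(as.vector(d_abs), as.vector(d_euc)))
print(same)

# Cluster on the absolute differences; the linkage method is a free choice
hc_abs <- hclust(d_abs, method = "average")
plot(hc_abs, main = "Clustering on |index differences|")
```

If the two distance objects agree, the dendrogram from `dist()` is already built on differences in the original index values, and what remains open is the choice of linkage method or whether the proportions should be transformed first.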