
I have produced a dendrogram in R through hierarchical cluster analysis. I have 310 individuals, each classified into 1 of 3 groups (my cut-off, k, looks to be 3) based on 4 criteria. I have plotted the dendrogram with the labels I want, but I would like to extract the results into a table, which would be easier to use for further statistical work. I went through the small text on my dendrogram manually and found an error in my work, so I would like R to create the table for me so I can verify it.

I have tried a few options from other websites, and from one entry on Stack Overflow, but have not been successful. Ideally the extraction would produce output in this format:

columns[Individual ID, clustering group label (1-3)] #with all the results below for my 310 individuals

Here is what I have tried:

leaf.order <- matrix(data=NA, ncol=2, nrow=nrow(residency2), dimnames=list(c(), c("row.num", "row.name")))
leaf.order[,2] <- hc.complete2$labels[hc.complete2$order]

Which gives error:

Error in leaf.order[, 2] <- hc.complete2$labels[hc.complete2$order] : number of items to replace is not a multiple of replacement length
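
One common way to get the table described above is `cutree()` from base R's `stats` package, which cuts an `hclust` tree into `k` groups and returns one label per individual in the original row order. The following is only a minimal sketch under stated assumptions: `example_data`, `hc` and `membership` are hypothetical names, and the simulated matrix merely stands in for `residency2`, which is not shown in the question. With the real objects, one would presumably call `cutree(hc.complete2, k = 3)` instead.

```r
# Simulated stand-in for residency2 (assumption): 310 individuals, 4 numeric criteria
set.seed(42)
example_data <- matrix(
  rnorm(310 * 4), nrow = 310,
  dimnames = list(paste0("ID", seq_len(310)), paste0("crit", 1:4))
)

# Complete-linkage hierarchical clustering on Euclidean distances
hc <- hclust(dist(example_data), method = "complete")

# cutree() returns a named integer vector: one group label (1..k) per individual,
# in the order of the original rows (not the dendrogram's left-to-right order)
groups <- cutree(hc, k = 3)

# Two-column table: individual ID and cluster label
membership <- data.frame(
  id = names(groups),
  cluster = unname(groups),
  stringsAsFactors = FALSE
)

head(membership)            # first few rows of the table
table(membership$cluster)   # group sizes, useful for checking against the plot

# Optionally export for further work (file name is illustrative):
# write.csv(membership, "cluster_membership.csv", row.names = FALSE)
```

Separately, it may help to compare `length(hc.complete2$labels)` with `nrow(residency2)`: the error above typically appears when the matrix column and the vector being assigned to it differ in length.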

Werner Hertzog
S Jav
  • Can you share some code of what you have tried? – mtoto Mar 26 '16 at 17:30
  • I was using this link: http://stackoverflow.com/questions/10088117/exporting-dendrogram-as-table-in-r – S Jav Mar 26 '16 at 18:10
  • But I get an error after the 2nd line of code: – S Jav Mar 26 '16 at 18:10
  • Sorry, I keep pressing enter. Running `leaf.order <- matrix(data=NA, ncol=2, nrow=nrow(residency2), dimnames=list(c(), c("row.num", "row.name")))` and then `leaf.order[,2] <- hc.complete2$labels[hc.complete2$order]` gives: Error in leaf.order[, 2] <- hc.complete2$labels[hc.complete2$order] : number of items to replace is not a multiple of replacement length – S Jav Mar 26 '16 at 18:11
  • see [here](http://stackoverflow.com/questions/12826552/exporting-hclust-cluster-membership) – mtoto Mar 26 '16 at 18:12
  • Yes!!! That worked amazingly, thank you so much!!! Saved me a world of headache :) – S Jav Mar 26 '16 at 18:41
  • I'd say you are going about it in a roundabout manner. If what you really need is to cluster your data into k groups, you should be using k-means/k-medoids clustering. See `pam` in the `cluster` package: it also includes the handy silhouette statistic and ways to plot it to help select your k, and it's easy to pull cluster membership from the pam object (use `str` to figure it out). – user3554004 Mar 26 '16 at 22:51
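
The `pam` suggestion in the last comment could look roughly like the sketch below. It assumes the `cluster` package is installed (it ships with standard R distributions) and again uses simulated data in place of `residency2`; `x`, `fit`, `membership_pam` and `avg_sil` are hypothetical names chosen for illustration.

```r
library(cluster)

# Simulated stand-in for the real data (assumption): 310 individuals, 4 criteria
set.seed(1)
x <- matrix(
  rnorm(310 * 4), nrow = 310,
  dimnames = list(paste0("ID", seq_len(310)), paste0("crit", 1:4))
)

# Partitioning around medoids (k-medoids) with k = 3
fit <- pam(x, k = 3)

# Cluster membership is stored as a named integer vector in the pam object
membership_pam <- data.frame(
  id = names(fit$clustering),
  cluster = unname(fit$clustering),
  stringsAsFactors = FALSE
)
head(membership_pam)

# Average silhouette width for several candidate k; larger values suggest
# better-separated clusters
avg_sil <- sapply(2:6, function(k) pam(x, k = k)$silinfo$avg.width)
names(avg_sil) <- 2:6
avg_sil
```

Note that k-medoids and hierarchical clustering can assign individuals differently, so the two tables are not expected to match label for label.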
