
I use the k-medoids algorithm pam to do clustering based on the (symmetric) distance matrix, tmp, below:

if(!require("cluster")) { install.packages("cluster");  require("cluster") } 
tmp <- matrix(c( 0,  20,  20,  20,  40,  60,  60,  60, 100, 120, 120, 120,
             20,   0,  20,  20,  60,  80,  40,  80, 120, 100, 140, 120,
             20,  20,   0,  20,  60,  80,  80,  80, 120, 140, 140,  80,
             20,  20,  20,   0,  60,  80,  80,  80, 120, 140, 140, 140,
             40,  60,  60,  60,   0,  20,  20,  20,  60,  80,  80,  80,
             60,  80,  80,  80,  20,   0,  20,  20,  40,  60,  60,  60,
             60,  40,  80,  80,  20,  20,   0,  20,  60,  80,  80,  80,
             60,  80,  80,  80,  20,  20,  20,   0,  60,  80,  80,  80,
             100, 120, 120, 120,  60,  40,  60,  60,   0,  20,  20,  20,
             120, 100, 140, 140,  80,  60,  80,  80,  20,   0,  20,  20,
             120, 140, 140, 140,  80,  60,  80,  80,  20,  20,   0,  20,
             120, 120,  80, 140,  80,  60,  80,  80,  20,  20,  20,   0),
             nrow = 12, dimnames = list(LETTERS[1:12], LETTERS[1:12]))
tmp_pam <- pam(as.dist(tmp, diag = TRUE, upper = TRUE) , k=3)
tmp_pam$clusinfo # get cluster info
tmp_pam$silinfo # get silhouette information
clusplot(tmp_pam)

I have read here that clusplot uses cmdscale and princomp, which makes sense. However, the order of the operations is not given.

How can I get the Component1 and Component2 coordinates, along with their cluster labels and point IDs, from the output of clusplot? I want access to these so that I can modify and plot them in ggplot.

I can guess the plotting is somehow related to the silhouette information but do not quite understand how we get to the final plot below:

[Image: clusplot output for tmp_pam]

Zhubarb
  • You can access the code of the function (i.e. `edit(cluster:::clusplot.default)`). – May 12 '15 at 14:31
  • @Pascal, thank you - I was just looking at the code actually. It looks painfully long. I may be better off taking a guess at how to implement it. – Zhubarb May 12 '15 at 14:33
  • Whatever happened to this query? Did you ever manage to find a way to obtain the coordinates you were after? I have the exact same problem, where I want to identify one specific point in a `fviz_cluster` plot. – FaCoffee Oct 09 '18 at 14:39

1 Answer


According to the documentation, clusplot uses either

  • Principal Component Analysis
  • Multidimensional Scaling

to project your data into two dimensions. Which one is used depends on whether you passed a data matrix (Principal Component Analysis) or a dissimilarity matrix (Multidimensional Scaling).
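Since a distance matrix was passed here, the coordinates can probably be rebuilt outside of clusplot by running the scaling step yourself. The following is only a sketch, on the assumption (from my reading of the `clusplot.default` help page and source) that the dissimilarity branch calls `cmdscale(*, add = TRUE)` and falls back to plain `cmdscale` when the additive constant is negative. The names `d`, `mds` and `coords` are made up for illustration, and the signs of the axes may differ from those in the plot.

```r
library(cluster)  # provides pam()

# Distance matrix from the question (symmetric, zero diagonal)
tmp <- matrix(c(  0,  20,  20,  20,  40,  60,  60,  60, 100, 120, 120, 120,
                 20,   0,  20,  20,  60,  80,  40,  80, 120, 100, 140, 120,
                 20,  20,   0,  20,  60,  80,  80,  80, 120, 140, 140,  80,
                 20,  20,  20,   0,  60,  80,  80,  80, 120, 140, 140, 140,
                 40,  60,  60,  60,   0,  20,  20,  20,  60,  80,  80,  80,
                 60,  80,  80,  80,  20,   0,  20,  20,  40,  60,  60,  60,
                 60,  40,  80,  80,  20,  20,   0,  20,  60,  80,  80,  80,
                 60,  80,  80,  80,  20,  20,  20,   0,  60,  80,  80,  80,
                100, 120, 120, 120,  60,  40,  60,  60,   0,  20,  20,  20,
                120, 100, 140, 140,  80,  60,  80,  80,  20,   0,  20,  20,
                120, 140, 140, 140,  80,  60,  80,  80,  20,  20,   0,  20,
                120, 120,  80, 140,  80,  60,  80,  80,  20,  20,  20,   0),
              nrow = 12, dimnames = list(LETTERS[1:12], LETTERS[1:12]))

d <- as.dist(tmp)
tmp_pam <- pam(d, k = 3)

# Assumption: mimic the MDS step that clusplot applies to dissimilarities.
# add = TRUE estimates an additive constant; if it comes out negative,
# fall back to classical MDS without it.
mds <- cmdscale(d, k = 2, eig = TRUE, add = TRUE)
if (mds$ac < 0) mds <- cmdscale(d, k = 2, eig = TRUE)

# One row per observation: id, 2-D coordinates, cluster label, medoid flag.
# pam() and cmdscale() both keep the observation order of d.
coords <- data.frame(
  id         = rownames(mds$points),
  Component1 = mds$points[, 1],
  Component2 = mds$points[, 2],
  cluster    = factor(tmp_pam$clustering),
  is_medoid  = seq_len(nrow(mds$points)) %in% tmp_pam$id.med
)

print(coords)
```

If this matches what clusplot draws, the medoids are not placed separately: they are ordinary observations that get projected together with all the others, and `is_medoid` simply marks them. The `coords` data frame can then be handed to ggplot, for example with `Component1` and `Component2` as the axes, `cluster` as the colour and `id` as the label.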

Has QUIT--Anony-Mousse
  • I feed in a distance matrix (`tmp` above). I think what it does is: (1) find the medoid centres, (2) project to 2-D via cmdscale and princomp, (3) place the cluster members around each cluster centre (which I am not sure how to do). It is just a shame if it does not give access to the coordinates and other plot elements and forces one to reinvent the wheel. – Zhubarb May 13 '15 at 06:47