
I have a set of 10 points and need to find the clustering, given 2 distinct cluster centers (8,3) & (8,-1) as the starting point.

If I do it manually, I get the cluster centers I expect: (4,5) & (4,-1).

If I use R's kmeans, I instead get centers for a right and a left cluster: (10.4,2) & (-2.4,3.2).

My R code is:

x = c(-6,4,-3,7,1,6,-4,0,0,-1,11,7,8,3,8,-1,13,3,12,-2)  
xx = matrix(x,nrow=2) # 2 x 10 matrix  
xx  
mx = t(xx) # transpose to 10 x 2 matrix  
mx  
kcenters = matrix(c(8,8,3,-1),ncol=2)  
kcenters  
km = NULL  
km <- kmeans(mx, centers=kcenters, iter.max=1)  
km$centers  

I found this answer, "R k-means algorithm custom centers", but that doesn't seem to work for me either.

Any suggestions as to what I am doing wrong?
Thanks

Jakub
  • Can you show/explain how you did it manually? – Alex P Oct 28 '17 at 20:00
  • 1. Draw a line between the 2 distinct original cluster centers (8,3) & (8,-1). – user3757860 Nov 08 '17 at 13:46
  • Let's try this again. 1. Draw a line between the 2 distinct original cluster centers (8,3) & (8,-1). 2. Draw a line perpendicular to the 1st line at its midpoint (8,1). 3. Find the center of all the points on one side of the perpendicular line and the center of the points on the other side. In this case (-4,0), (0,-1), (8,-1), & (12,-2) are below the line. The rest are above the line. The new centers are (4,5) & (4,-1). 4. Since all the points are on the same side of the new dividing line as before, you are done. – user3757860 Nov 08 '17 at 14:12
  • There are faster parallel libraries that perform Lloyd's algorithm like `knor`. `install.packages("knor"); require(knor); km <- Kmeans(mx, kcenters)` – quine Oct 30 '18 at 02:22
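
The manual procedure described in the comments above (assign each point to its nearest center, then recompute each center as the mean of its points) can be sketched in base R. This is only an illustration under stated assumptions: `lloyd_step` is a hypothetical helper written for this example, not a function from any package, and it assumes that no cluster ends up empty.

```r
# Points and starting centers from the question, one (x, y) pair per row
mx <- matrix(c(-6, 4,  -3, 7,   1, 6,  -4, 0,   0, -1,
               11, 7,   8, 3,   8, -1, 13, 3,  12, -2),
             ncol = 2, byrow = TRUE)
centers <- matrix(c(8,  3,
                    8, -1), ncol = 2, byrow = TRUE)

# Hypothetical helper: one assignment step followed by one update step
lloyd_step <- function(points, centers) {
  # Squared distances: one row per point, one column per center
  d2 <- sapply(seq_len(nrow(centers)), function(k) {
    rowSums(sweep(points, 2, centers[k, ])^2)
  })
  # Index of the nearest center for every point
  cluster <- max.col(-d2, ties.method = "first")
  # New center = mean of the points assigned to it (assumes no empty cluster)
  new_centers <- t(sapply(seq_len(nrow(centers)), function(k) {
    colMeans(points[cluster == k, , drop = FALSE])
  }))
  list(cluster = cluster, centers = new_centers)
}

step1 <- lloyd_step(mx, centers)
step2 <- lloyd_step(mx, step1$centers)

step1$centers                              # centers after the first step
identical(step1$cluster, step2$cluster)    # TRUE means the assignment is stable
```

With the data from the question, the first step moves the centers to (4,5) and (4,-1), and the second step leaves the assignment unchanged, which matches the hand calculation.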

1 Answer


The problem is that the default value of the algorithm argument is "Hartigan-Wong", but your manual calculation is probably following "Lloyd". If you change your kmeans statement to

km <- kmeans(mx, centers=kcenters, algorithm="Lloyd")

you will get the answer that you are expecting. There is a detailed explanation of the differences between the algorithms on the Data Science Forum.
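
To see the difference side by side, a minimal sketch that runs both algorithms from the same starting centers may help. It uses only the data from the question and the `algorithm` argument of `stats::kmeans`; the object names `km_hw` and `km_lloyd` are made up for this example.

```r
# Data and starting centers exactly as in the question
x  <- c(-6, 4, -3, 7, 1, 6, -4, 0, 0, -1, 11, 7, 8, 3, 8, -1, 13, 3, 12, -2)
mx <- t(matrix(x, nrow = 2))                   # 10 x 2, one point per row
kcenters <- matrix(c(8, 8, 3, -1), ncol = 2)   # rows: (8,3) and (8,-1)

# Default algorithm ("Hartigan-Wong") versus Lloyd, same start
km_hw    <- kmeans(mx, centers = kcenters)
km_lloyd <- kmeans(mx, centers = kcenters, algorithm = "Lloyd")

km_hw$centers      # the question reports (10.4, 2) and (-2.4, 3.2) here
km_lloyd$centers   # matches the manual result (4, 5) and (4, -1)

# Total within-cluster sum of squares for each solution
c(HartiganWong = km_hw$tot.withinss, Lloyd = km_lloyd$tot.withinss)
```

Neither result is a bug. Working the sums by hand, the Lloyd partition has a total within-cluster sum of squares of 484, while the left/right partition reported in the question comes to 157.2, so from this start Hartigan-Wong reaches a tighter clustering and Lloyd stops at a local optimum.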

G5W