
I have a very large amount of data in the form of a matrix. I have already clustered it using k-means clustering in MATLAB R2013a. I want the exact coordinates of the centroid of each cluster. Is there a formula or some other way to get them?

I want to find the centroid of each cluster so that whenever new data arrives in the matrix, I can compute its distance from each centroid and determine which cluster the new data belongs to.

My data is heterogeneous in nature, so it is difficult to compute the average of the data in each cluster. I am therefore trying to write some code that prints the centroid locations automatically.

AlessioX
  • Post the code you used for kmeans, the centroids are an output of the MATLAB function... – Dan Feb 12 '16 at 09:48
  • Use the **documentation**... but if your data is "heterogeneous in nature", k-means may fail to produce a meaningful result. Carefully study the result! **It can be 'optimal' in the k-means sense, yet useless and biased** at the same time. – Has QUIT--Anony-Mousse Mar 10 '16 at 06:57

2 Answers


In MATLAB, use

[idx,C] = kmeans(..) 

instead of

idx = kmeans(..) 

As per the documentation:

[idx,C] = kmeans(..) returns the k cluster centroid locations in the k-by-p matrix C.
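
To connect this with the goal in the question (assigning new data to the nearest centroid), here is a minimal sketch. The data in `X` and `Xnew`, the value of `k`, and the variable names `D` and `newIdx` are made up for illustration, and it assumes the Statistics Toolbox `kmeans` with its default squared Euclidean distance. `bsxfun` is used instead of implicit expansion so that it should also run on older releases such as R2013a.

```matlab
% Toy data: two well-separated groups of 2-D points (made up for illustration).
X = [1.0 1.1; 0.9 1.0; 1.2 0.8; 8.0 8.1; 7.9 8.2; 8.1 7.9];
k = 2;

% The second output C is the k-by-p matrix of centroid locations.
[idx, C] = kmeans(X, k);

% New observations arriving later (also made up).
Xnew = [1.05 0.95; 7.5 8.4];

% Squared Euclidean distance from every new point to every centroid.
D = zeros(size(Xnew, 1), k);
for j = 1:k
    diffs = bsxfun(@minus, Xnew, C(j, :));   % subtract centroid j from each row
    D(:, j) = sum(diffs .^ 2, 2);
end

% The nearest centroid gives the cluster label for each new point.
[~, newIdx] = min(D, [], 2);
```

Row `i` of `newIdx` then holds the cluster number for row `i` of `Xnew`. If you clustered with a different `'Distance'` option, the distance computed here would need to match it.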

Dan
prashanth

The centroid of a cluster is simply the average of the coordinates of all the points assigned to that cluster.

If you have the assignments {point; cluster}, you can easily evaluate the centroid. Say a given cluster has n points assigned to it, and these points are a1, a2, ..., an. The centroid of that cluster is then:

centroid=(a1+a2+...+an)/n

You can run this computation in a loop over the clusters; the details depend on how your data structure (i.e. the point-to-cluster assignment) is organized.
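
As a rough sketch of that loop, assuming the points are stored as rows of a numeric matrix `X` and the assignments as a vector `idx` of cluster numbers from 1 to k (both are toy values made up here, not the OP's data):

```matlab
% Toy data and hypothetical cluster labels, one label per row of X.
X   = [1 2; 3 4; 5 6; 10 10; 12 14];
idx = [1; 1; 1; 2; 2];
k   = max(idx);

% One centroid per row of C.
C = zeros(k, size(X, 2));
for j = 1:k
    % Average along dimension 1 so that a single-point cluster
    % still yields a row vector rather than a scalar.
    C(j, :) = mean(X(idx == j, :), 1);
end
% C is now [3 4; 11 12]
```

This only makes sense when every column of `X` is numeric; for mixed-type data the plain average is not defined in this form.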

AlessioX
  • But why would you do this when the cluster centroid locations are just the second output of MATLAB's [`kmeans`](http://www.mathworks.com/help/stats/kmeans.html?refresh=true) already... – Dan Feb 12 '16 at 09:49
  • The OP didn't specify which toolbox/function was used and is looking for a formula to evaluate the centroids. This is the definition of the centroid and its formula. If he/she has been using the MATLAB built-in `kmeans` function, then you're right, he/she can use the given output instead. – AlessioX Feb 12 '16 at 09:51