
I am trying to run fuzzy k-means clustering on a dataset of dimension 15'000 x 7. I first tried the function fanny, and it took R almost 7 hours to return a result (I also tried other parameter settings, but it is always slow; with a sample of 5'000 rows it takes about half an hour). With the cmeans function it takes 27 seconds. What is cmeans doing differently from fanny? Here is how I called the two functions:

library(cluster)  # provides fanny
library(e1071)    # provides cmeans
fn <- fanny(training, k = 40, memb.exp = 1.3, metric = "manhattan")
cn <- cmeans(training, centers = 40, iter.max = 500, dist = "manhattan", method = "cmeans", m = 1.3)

The resulting memberships are similar but not identical. In addition, how are the centers in cmeans computed? For fanny I compute them as follows:

# Membership-weighted mean of each column, one row per cluster
cent <- matrix(NA, 40, ncol(training))
for (k in 1:40) {
  cent[k, ] <- colSums(fn$membership[, k] * training) / sum(fn$membership[, k])
}

Applying the same computation to the cmeans memberships (cn$membership) gives results that differ from cn$centers.
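To make the comparison easier to reproduce, here is a minimal base-R sketch of the center computation on made-up data. The toy matrices `x` and `u` and the helper `weighted_centers` are hypothetical names introduced only for illustration. With `m = 1` it matches the loop above; with `m = 1.3` it applies the textbook fuzzy c-means update, which raises the memberships to the power m before weighting. Whether cmeans uses exactly this update when dist="manhattan" is an assumption I have not verified.

```r
# Toy data and memberships (made up for illustration; not the real `training` set).
set.seed(1)
x <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)
u <- matrix(runif(20 * 4), nrow = 20, ncol = 4)
u <- u / rowSums(u)            # each row of memberships sums to 1

# Hypothetical helper: membership-weighted column means, one row per cluster.
# `m` is the exponent applied to the memberships before weighting.
weighted_centers <- function(x, u, m = 1) {
  w <- u^m                     # n x k matrix of weights
  t(w) %*% x / colSums(w)      # k x p matrix of centers
}

cent_plain <- weighted_centers(x, u, m = 1)    # same weights as the loop in the question
cent_fuzzy <- weighted_centers(x, u, m = 1.3)  # textbook fuzzy c-means weights u^m

# Loop version from the question, for comparison.
cent_loop <- matrix(NA, ncol(u), ncol(x))
for (k in seq_len(ncol(u))) {
  cent_loop[k, ] <- colSums(u[, k] * x) / sum(u[, k])
}
```

Comparing `cent_plain` and `cent_fuzzy` against `cn$centers` on the real data would show whether the exponent alone explains the difference.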

Many thanks!

Vanessa
