
I was just wondering if there is a way in the cmeans function (in package e1071) to perform the clustering using the Mahalanobis distance?

Many thanks

ERE
  • Do you have your answer or are you looking for something else? – cdeterman Oct 07 '14 at 21:52
  • Thank you very much for the prompt reply. I had tried passing a Mahalanobis distance matrix to the fanny function. However, I was not sure whether these two functions cluster the data in a similar way, or what the difference is between the membership exponent in fanny and the fuzzification parameter m in cmeans. Cheers – ERE Oct 10 '14 at 08:20

1 Answer


The e1071 package does not have a Mahalanobis option: the dist argument of cmeans accepts only "euclidean" and "manhattan". However, you can look into the fanny function in the cluster package. As per its help page, it also computes a fuzzy clustering of the data into k clusters, and it lets you provide your own dissimilarity matrix instead of the raw data.

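So for Mahalanobis distance, you can calculate your distance matrix with dist and then run your clustering. Note that the base dist function has no "mahalanobis" method, so the example below first whitens the data with the covariance of the whole data set; Euclidean distances on the whitened data are Mahalanobis distances on the original data.

require(cluster)
set.seed(123)
x <- rbind(matrix(rnorm(100, sd=0.3), ncol=2),
           matrix(rnorm(100, mean=1, sd=0.3), ncol=2))
# Whiten with the Cholesky factor of the covariance, so that cov(z) is the identity
z <- x %*% solve(chol(cov(x)))
# Euclidean distance on z equals Mahalanobis distance on x
y <- dist(z)
fanny(y, k=2)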
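
As a sanity check on obtaining Mahalanobis distances in base R, the following sketch whitens the data with the Cholesky factor of the sample covariance and compares the resulting Euclidean distances against stats::mahalanobis. It assumes the covariance of the whole data set is the one wanted (rather than, say, a per-cluster covariance), and the names S, z, d_whitened and d_direct are only illustrative.

```r
set.seed(123)
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))

S <- cov(x)                  # covariance of the whole data set (an assumption)
z <- x %*% solve(chol(S))    # whitened data: cov(z) is the identity

# Pairwise Euclidean distances on the whitened data
d_whitened <- as.matrix(dist(z))

# Direct computation: mahalanobis() returns SQUARED distances from one center,
# so take the square root and loop over the rows (pmax guards against
# tiny negative values caused by rounding)
d_direct <- t(sapply(seq_len(nrow(x)), function(i) {
  sqrt(pmax(mahalanobis(x, center = x[i, ], cov = S), 0))
}))

# Largest discrepancy between the two; expected to be at floating-point level
max_diff <- max(abs(d_whitened - d_direct))
print(max_diff)
```

The whitened matrix z could also be handed to a Euclidean-only routine, for instance cmeans(z, centers=2, dist="euclidean"). That should amount to clustering x under this global Mahalanobis metric, with the returned centers expressed in whitened coordinates.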

Given your understandable concerns over equivalence between the functions, here is an example comparing them:

require(e1071)
cl<-cmeans(x,centers=2,iter.max=20,dist="euclidean",method="cmeans",m=2)
fl <- fanny(x, k=2, maxit=20, metric="SqEuclidean", memb.exp=2)

> head(cl$membership)
             1           2
[1,] 0.9948729 0.005127121
[2,] 0.3647778 0.635222221
[3,] 0.9290126 0.070987385
[4,] 0.7588260 0.241174043
[5,] 0.9282550 0.071745007
[6,] 0.9599231 0.040076886
> head(fl$membership)
          [,1]        [,2]
[1,] 0.9948722 0.005127775
[2,] 0.3647890 0.635211040
[3,] 0.9290171 0.070982905
[4,] 0.7588304 0.241169649
[5,] 0.9282575 0.071742489
[6,] 0.9599221 0.040077878

Although not absolutely identical, you can see they are very close. You will also notice that fanny is told to use the squared Euclidean distance, which is what cmeans uses. This equivalence is noted on the fanny help page (?fanny) under metric.
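
To put a number on the agreement, the two membership matrices can be compared directly. The sketch below reruns the comparison and reports the largest absolute difference; since cluster labels are arbitrary, it also checks the swapped column order and keeps the smaller value. The helper names (u_cmeans, u_fanny, max_diff) are illustrative, and the exact figure may depend on the random starting centers used by cmeans.

```r
library(e1071)
library(cluster)

set.seed(123)
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))

cl <- cmeans(x, centers = 2, iter.max = 20, dist = "euclidean",
             method = "cmeans", m = 2)
fl <- fanny(x, k = 2, maxit = 20, metric = "SqEuclidean", memb.exp = 2)

# Drop dimnames so the matrices compare element by element
u_cmeans <- unname(cl$membership)
u_fanny  <- unname(fl$membership)

# Cluster labels are arbitrary, so try both column orders
diff_same    <- max(abs(u_cmeans - u_fanny))
diff_swapped <- max(abs(u_cmeans - u_fanny[, 2:1]))
max_diff <- min(diff_same, diff_swapped)
print(max_diff)
```

For the six rows printed above, the largest gap is roughly 1e-5, which is consistent with both functions optimising the same objective but stopping at slightly different points.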

cdeterman