Passing distance matrix to k-means clustering in sklearn

Question

According to the sklearn KMeans documentation, k-means requires a matrix of shape (n_samples, n_features). However, I provided a distance matrix of shape (n_samples, n_samples), where each entry holds the distance between two strings. The time series were converted into strings using the SAX representation.

When I ran the clustering with the distance matrix, it gave good results. What could be the reason for this? As far as I know, k-medoids is the algorithm that works with a distance matrix.

score 6 · Answer 1 · answered Apr 20 '17 at 21:53


K-means, as the name indicates, uses means.

Computing the arithmetic mean requires access to the original features; a distance matrix cannot be used for that.

K-means also does not use pairwise distances. So the distance matrix is useless for this algorithm.
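To make the point above concrete, here is a minimal sketch of a single k-means iteration in NumPy. It is not sklearn's implementation; the function name `lloyd_step` and the toy data are made up for illustration. Note that both steps operate on feature vectors: distances are measured from points to centroids, and centroids are recomputed as feature-wise means.

```python
import numpy as np

def lloyd_step(X, centers):
    """One k-means iteration on X of shape (n_samples, n_features).

    Hypothetical helper for illustration only.
    """
    # Assignment step: distance from every point to every centroid
    # (point-to-centroid, not point-to-point).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # Update step: each centroid becomes the arithmetic mean of the
    # feature vectors assigned to it (kept unchanged if it has no members).
    new_centers = np.array([
        X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
        for k in range(len(centers))
    ])
    return labels, new_centers

# Toy feature matrix (made-up values): two obvious groups.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers = X[[0, 2]].copy()
labels, centers = lloyd_step(X, centers)
```

If an (n_samples, n_samples) distance matrix is passed as `X`, code like this would simply treat each row as an n_samples-dimensional feature vector, which is likely why the call runs at all and can even produce plausible-looking clusters.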

Choose a different algorithm instead, such as hierarchical clustering.
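As a rough sketch of that suggestion, hierarchical clustering can be run directly on a precomputed distance matrix with SciPy. The matrix values below are invented, and the choices of average linkage and two clusters are assumptions made for the example, not recommendations for the SAX data in the question.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Toy symmetric distance matrix with a zero diagonal (made-up values).
D = np.array([
    [0.0, 1.0, 9.0, 8.0],
    [1.0, 0.0, 8.5, 9.5],
    [9.0, 8.5, 0.0, 1.5],
    [8.0, 9.5, 1.5, 0.0],
])

# linkage() expects the condensed (upper-triangle) form of the distances.
condensed = squareform(D, checks=True)

# Average linkage only needs pairwise dissimilarities, no feature vectors.
Z = linkage(condensed, method="average")

# Cut the dendrogram so that at most two flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
```

Linkage methods such as "ward" or "centroid" assume Euclidean geometry, so "average", "complete", or "single" are the safer choices when the distances come from a string dissimilarity.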

answered Apr 20 '17 at 21:53

Has QUIT--Anony-Mousse


What algorithm supports this? – narzero Jun 05 '19 at 09:44
