I have a (60000, 60000) sparse matrix of cosine distances. Are there any frameworks or built-in SciPy tools for performing hierarchical clustering on this data? In a similar question, the author has an observation matrix rather than a distance matrix.
-
Is this useful? http://dev.bizo.com/2012/01/clustering-of-sparse-data-using-python.html – Max Pierini May 19 '21 at 18:06
-
If your distance matrix is sparse that means most of your observations are a distance of 0 from each other. I'm not sure hierarchical clustering makes sense for your data. – CJR May 19 '21 at 21:15
-
@MaxPierini Unfortunately not really. I am looking for hierarchical clustering, and sklearn does not support it for sparse matrices – fiendfire28 May 20 '21 at 05:14
-
@CJR Not necessarily. A sparse format works whenever most of the observations are similar, so that one value dominates. That value is usually zero, but not in my case – fiendfire28 May 20 '21 at 05:15
-
Please add code that reproduces a sample of your matrix – Max Pierini May 20 '21 at 06:16
-
While true, scipy sparse matrices define unstored values as 0 by convention. I don't think there's an easy way around that for clustering. – CJR May 20 '21 at 13:03
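To make the point about unstored values concrete, here is a minimal sketch on a toy 4x4 matrix. It is not a confirmed solution for the 60000x60000 case. It assumes that unstored pairs should be read as a fixed "far apart" distance (`fill_value = 1.0` is a made-up choice, as is the toy data), densifies the matrix with that value, and passes the condensed form to `scipy.cluster.hierarchy.linkage`. Note that a condensed float64 distance vector for 60000 points has about 1.8 billion entries, roughly 14.4 GB, so this approach may only be practical on a subsample.

```python
import numpy as np
from scipy import sparse
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Toy symmetric distance matrix (hypothetical data): only the pairs
# (0, 1) and (2, 3) are stored, every other pair is left unstored.
rows = np.array([0, 1, 2, 3])
cols = np.array([1, 0, 3, 2])
vals = np.array([0.1, 0.1, 0.2, 0.2])
D_sparse = sparse.csr_matrix((vals, (rows, cols)), shape=(4, 4))

# By SciPy's convention, unstored entries densify to 0,
# i.e. they would be treated as "identical" points.
dense_default = D_sparse.toarray()

# Assumption: unstored pairs actually mean a fixed large distance.
fill_value = 1.0
coo = D_sparse.tocoo()
D = np.full(D_sparse.shape, fill_value)
D[coo.row, coo.col] = coo.data   # restore the stored distances
np.fill_diagonal(D, 0.0)         # self-distance must be zero

# linkage expects a condensed (upper-triangle) distance vector.
Z = linkage(squareform(D), method="average")

# Cut the dendrogram at distance 0.5 to get flat cluster labels.
labels = fcluster(Z, t=0.5, criterion="distance")
```

With the default zero fill, the unstored pairs would be merged first. With the assumed fill value, points 0 and 1 end up in one cluster and points 2 and 3 in another.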