I have a (60000, 60000) sparse matrix of cosine distances. Are there any frameworks or built-in SciPy tools for performing hierarchical clustering on this data? In a similar question, the author has an observation matrix rather than a distance matrix.

fiendfire28
  • Is this useful? http://dev.bizo.com/2012/01/clustering-of-sparse-data-using-python.html – Max Pierini May 19 '21 at 18:06
  • If your distance matrix is sparse that means most of your observations are a distance of 0 from each other. I'm not sure hierarchical clustering makes sense for your data. – CJR May 19 '21 at 21:15
  • @MaxPierini Unfortunately not very. I am looking for hierarchical clustering, and sklearn does not support it for sparse matrices – fiendfire28 May 20 '21 at 05:14
  • @CJR Not necessarily. A sparse format works when most of the observations are similar. That is usually zeros, but not in my case – fiendfire28 May 20 '21 at 05:15
  • Please add the code to reproduce a sample of your matrix – Max Pierini May 20 '21 at 06:16
  • While true, scipy sparse matrices define unstored values as 0 by convention. I don't think there's an easy way around that for clustering. – CJR May 20 '21 at 13:03
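
The points raised in the comments can be illustrated on a toy matrix. The following is a minimal sketch, not a tested solution for the 60000 x 60000 case: it assumes the distances fit in memory once densified, uses six made-up 2-D vectors in place of the real data, and picks average linkage arbitrarily. All variable names are illustrative. It relies on `scipy.cluster.hierarchy.linkage`, which takes a condensed (upper-triangle) distance vector, so the sparse matrix is densified first; as noted above, any unstored entry becomes a distance of 0 at that step.

```python
import numpy as np
from scipy import sparse
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

# Toy stand-in for the real data (assumption): six 2-D vectors in two groups.
X = np.array([[1.0, 0.0], [1.0, 0.1], [1.0, 0.2],
              [0.0, 1.0], [0.1, 1.0], [0.2, 1.0]])

# Square cosine-distance matrix, stored as a SciPy sparse matrix.
D_sparse = sparse.csr_matrix(squareform(pdist(X, metric="cosine")))

# Densifying fills every unstored entry with 0, i.e. "distance 0".
D_dense = D_sparse.toarray()

# linkage() expects the condensed (upper-triangle) form of the distances.
condensed = squareform(D_dense, checks=False)
Z = linkage(condensed, method="average")  # linkage method chosen arbitrarily

# Cut the dendrogram into two flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)

# Rough size of the condensed float64 vector for n = 60000 observations.
n = 60000
condensed_gb = n * (n - 1) // 2 * 8 / 1e9
print(f"{condensed_gb:.1f} GB")
```

The final estimate shows why this route is demanding at the size in the question: the condensed vector alone needs roughly 14 GB as float64, before any working memory used by the linkage computation.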

0 Answers