I am using SciPy for hierarchical clustering. I can get flat clusters at a given threshold using fcluster, but I also need to visualize the resulting dendrogram. The dendrogram function works fine for 5-6k user vectors, but my dataset consists of 16k user vectors. When I run it for 16k users, the dendrogram function throws the following error:

File "/home/enthought/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2333, in _dendrogram_calculate_info
leaf_label_func, i, labels)
File "/home/enthought/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2205, in _append_singleton_leaf_node
ivl.append(str(int(i)))
RuntimeError: maximum recursion depth exceeded while getting the str of an object

Any ideas on visualizing a dendrogram for a larger dataset?

Maxwell
  • A simple idea is to extend your memory, otherwise you may need to dive into the implementation detail to make the routine memory friendly. – xiao 啸 May 06 '12 at 23:40
  • I had the same thing happen to me, but only when clustering was done with some methods (single, average, complete), but not ward. I wonder what triggers this - what are the properties of the same size linkage matrices that makes the recursion go so deep? – user1603472 Feb 28 '17 at 14:19

2 Answers

This may be a bit late, but if you are comfortable raising Python's recursion limit to get around the error, you can do so. It's not recommended, and definitely not 'pythonic', but it will likely get you the results you want.

import sys
sys.setrecursionlimit(10000)
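
If you would rather not leave the higher limit in place for the whole program, one option is to raise it only around the call that needs it. The following is a minimal sketch, not part of SciPy: `recursion_limit` and `depth` are hypothetical names, and `depth` is just a stand-in for a deeply recursive call such as the dendrogram traversal.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit and restore the old value on exit."""
    old_limit = sys.getrecursionlimit()
    # Never lower the limit below what is already configured.
    sys.setrecursionlimit(max(limit, old_limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(old_limit)


def depth(n):
    # Hypothetical stand-in for a deeply recursive function.
    return 0 if n == 0 else 1 + depth(n - 1)


limit_before = sys.getrecursionlimit()

with recursion_limit(10000):
    # Would raise RecursionError under the default limit (usually 1000).
    result = depth(3000)

limit_after = sys.getrecursionlimit()
```

In the question's case, the call inside the `with` block would be the dendrogram call itself. Keep in mind that a very high limit can still crash the interpreter if the underlying stack runs out, so choose a value only as large as you need.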
VedTopkar

Using sys.setrecursionlimit(1000000), I was able to process a large matrix and successfully complete a seaborn.clustermap call. I imagine this error might also be resolved by upgrading scipy, or by passing additional arguments and building the clustermap more carefully using scipy directly.
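
As one example of "additional arguments", SciPy's dendrogram accepts `truncate_mode` and `p`, which contract everything below the last `p` merges into single leaves. This is not what either answer used; it is a sketch under the assumption that a condensed tree is acceptable for visualization. The random data, the `average` method, and `p=30` are placeholders, and `no_plot=True` only computes the layout without drawing.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Placeholder data standing in for the user vectors (assumption: 200 x 5).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

# Build the linkage matrix; the method here is arbitrary for the sketch.
Z = linkage(X, method="average")

# Keep only the last 30 merges as internal nodes; every cluster formed
# earlier is contracted into a single leaf. no_plot=True returns the
# layout dictionary without requiring matplotlib.
info = dendrogram(Z, truncate_mode="lastp", p=30, no_plot=True)

# Number of leaves that would be drawn.
n_leaves = len(info["ivl"])
```

Because the traversal stops at the contracted clusters, a truncated dendrogram visits far fewer nodes than the full tree, and a plot with 16k leaves is rarely readable anyway. Drop `no_plot=True` to draw it with matplotlib.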