I am looking to cluster short text documents, each a few hundred characters long.

I have been using Carrot2 Workbench and I really like its capabilities, but its API is archaic and difficult to understand and use.

I am looking for a replacement that has similar capabilities (clustering algorithms) but with a better API.

I'm looking for something in Java or Python, and it has to be open source and free as in beer.

So LingPipe (http://alias-i.com/lingpipe/) does not qualify.

Thanks.

user1172468

1 Answer

scikit-learn is a Python library that supports a wide range of machine learning algorithms (including clustering) and is very well documented.
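As a rough illustration of what clustering short texts with scikit-learn can look like, here is a minimal sketch. It assumes TF-IDF features and k-means, which is just one of several possible combinations and not necessarily equivalent to what Carrot2 Workbench does. The sample documents, the `cluster_texts` helper, and the choice of two clusters are all made up for the example.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample documents; replace with your own short texts.
docs = [
    "The cat sat on the mat and purred",
    "A cat chased the mouse around the house",
    "My cat loves to nap in the sun",
    "Stock markets fell sharply on Monday",
    "Investors sold stock as markets dropped",
    "The stock exchange closed lower after markets slid",
]


def cluster_texts(texts, n_clusters=2, seed=0):
    """Cluster texts with TF-IDF + k-means (an assumed, illustrative setup)."""
    # Turn each document into a sparse TF-IDF vector, dropping English stop words.
    vectorizer = TfidfVectorizer(stop_words="english")
    features = vectorizer.fit_transform(texts)

    # Fit k-means; random_state makes the run repeatable.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(features)
    return labels, vectorizer, model


labels, vectorizer, model = cluster_texts(docs)

# Print the cluster id assigned to each document.
for label, doc in zip(labels, docs):
    print(label, doc)
```

The number of clusters has to be chosen up front with k-means. If that is a problem, the `sklearn.cluster` module has other estimators (for example `AgglomerativeClustering` or `DBSCAN`) that can be tried on the same TF-IDF matrix.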

mbatchkarov