
I need to calculate tf-idf for a set of documents and am looking for a Java library that does this.

NOTE: I am aware of Mahout, but what I really want is a library with a simple interface that does not require any infrastructure setup.

user1172468
  • The classes in Mahout are just simple calls to Lucene. They require no infrastructure, especially if you *read them* (they are open source). – bmargulies Jun 10 '13 at 01:16

1 Answer


Mahout is easy to install and use. All you need is a JDK and Maven; see "how to install mahout".

You can also run Mahout on Hadoop, but that is optional: Mahout runs locally without Hadoop. If you do decide to use Hadoop, this blog may help with installing it.
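If even a local Mahout setup is more than you need, tf-idf is small enough to compute with the JDK alone. The following is only a minimal sketch, not Mahout's or Lucene's implementation: the class name `TfIdfSketch` and its methods are hypothetical, the tokenizer is a naive split on non-word characters, and the weighting assumed here is tf = count / document length and idf = ln(N / df), with no smoothing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, written for illustration only.
public class TfIdfSketch {

    // Naive tokenizer (an assumption): lowercase, split on non-word characters.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns one term -> weight map per input document, in input order.
    public static List<Map<String, Double>> compute(List<List<String>> docs) {
        // Document frequency: in how many documents each term occurs.
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }

        List<Map<String, Double>> result = new ArrayList<>();
        for (List<String> doc : docs) {
            // Raw term counts within this document.
            Map<String, Integer> counts = new HashMap<>();
            for (String term : doc) {
                counts.merge(term, 1, Integer::sum);
            }

            Map<String, Double> weights = new HashMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                double tf = (double) e.getValue() / doc.size();
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                weights.put(e.getKey(), tf * idf);
            }
            result.add(weights);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> docs = new ArrayList<>();
        docs.add(tokenize("the cat sat"));
        docs.add(tokenize("the dog barked"));

        List<Map<String, Double>> weights = compute(docs);
        for (int i = 0; i < weights.size(); i++) {
            System.out.println("doc " + i + ": " + weights.get(i));
        }
    }
}
```

With this weighting, a term that appears in every document gets a weight of 0, so common words drop out without a stop-word list. For real text you would probably want a proper analyzer (stemming, stop words) and a smoothed idf, which is where Lucene or Mahout become worth the setup.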

Freya Ren
  • can you help on this http://stackoverflow.com/questions/31837788/how-can-i-calculate-total-number-of-term-frequency-in-all-collection-s-document – user1 Aug 08 '15 at 05:19