I need to calculate tf-idf for a set of documents and am looking for a Java library that does this.
NOTE: I am aware of Mahout, but what I really want is a library with a simple interface that does not require any infrastructure setup.
Mahout is easy to install and use. All you need is a JDK and Maven (see: how to install Mahout).
You can also run Mahout on top of Hadoop, but that is not required: Mahout runs locally without Hadoop. If you do want Hadoop, you may find this blog helpful for installing it.
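If even that is more setup than you want, tf-idf for a small set of documents is simple enough to compute in plain Java. Below is a minimal sketch, not Mahout's API: the class name `TfIdfSketch`, the naive tokenizer, and the weighting choice (tf = term count / document length, idf = ln(N / document frequency)) are all my own assumptions. Other tf-idf variants use smoothing or different normalization, so adjust to taste.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Mahout or any library.
public class TfIdfSketch {

    // Naive tokenizer (an assumption): lowercase, split on anything that is
    // not a letter or digit, drop empty tokens.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns one term -> weight map per input document, in input order.
    public static List<Map<String, Double>> tfIdf(List<String> documents) {
        List<List<String>> tokenized = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();

        // First pass: tokenize and count in how many documents each term occurs.
        for (String doc : documents) {
            List<String> tokens = tokenize(doc);
            tokenized.add(tokens);
            for (String term : new HashSet<>(tokens)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }

        // Second pass: weight = (count / doc length) * ln(N / document frequency).
        int n = documents.size();
        List<Map<String, Double>> result = new ArrayList<>();
        for (List<String> tokens : tokenized) {
            Map<String, Integer> counts = new HashMap<>();
            for (String term : tokens) {
                counts.merge(term, 1, Integer::sum);
            }
            Map<String, Double> weights = new HashMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                double tf = (double) e.getValue() / tokens.size();
                double idf = Math.log((double) n / docFreq.get(e.getKey()));
                weights.put(e.getKey(), tf * idf);
            }
            result.add(weights);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "the cat sat", "the dog barked", "the cat purred loudly");
        List<Map<String, Double>> weights = tfIdf(docs);
        for (int i = 0; i < docs.size(); i++) {
            System.out.println(i + " -> " + weights.get(i));
        }
    }
}
```

With this weighting, a term that appears in every document (like "the" in the sample) gets a weight of zero, which is the usual behaviour for unsmoothed idf. For larger corpora or proper text analysis (stemming, stop words), a real library is still the better choice.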