-1

I have calculated the tf-idf values for the terms of document 1 and document 2, but I don't know how to use these values. Basically, I want to find the similarity between two documents (in my case, web pages). Can anybody tell me how to implement cosine similarity and the Jaccard coefficient to measure that similarity? C# code would be appreciated. Thanks.

jaskirat

1 Answer

0

I recommend a visit to Apache Mahout. It provides a complete kit of tools for this. Even if you don't want to use them, you can get the answers to these questions by looking at existing implementations.
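For a sense of what such implementations compute, here is a minimal sketch in Python rather than C#, and not taken from Mahout. It assumes each document is already represented as a mapping from term to tf-idf weight; the function names and the sample weights are made up for illustration. Cosine similarity is the dot product of the two weight vectors divided by the product of their lengths, and the Jaccard coefficient here is computed on the sets of terms only (it ignores the weights).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse tf-idf vectors (term -> weight)."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def jaccard_coefficient(a, b):
    """Size of the shared term set divided by the size of the combined term set."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


# Hypothetical tf-idf weights for two documents.
doc1 = {"apple": 0.5, "banana": 0.2, "cherry": 0.1}
doc2 = {"apple": 0.4, "cherry": 0.3, "date": 0.6}

print(cosine_similarity(doc1, doc2))
print(jaccard_coefficient(doc1, doc2))
```

The same logic should carry over to C# almost line for line, with a Dictionary<string, double> per document in place of the Python dict.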

bmargulies