I have calculated the TF-IDF values for the terms of document 1 and document 2, but I don't know how to use them. Basically, I want to find the similarity between two documents (in my case, web pages). Can anybody tell me how to implement cosine similarity and the Jaccard coefficient to measure that similarity? C# code would be appreciated. Thanks.
Viewed 718 times
1 Answer
I recommend a visit to Apache Mahout. It provides a complete kit of tools for this. Even if you don't want to use them, you can get the answers to these questions by looking at existing implementations.
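
For orientation, here is a minimal sketch of the two measures themselves. It is not taken from Mahout and it is written in Python rather than C#, but the logic should port directly (for example to a `Dictionary<string, double>` per document). It assumes each document is a mapping from term to TF-IDF weight. The function names and the sample weights are made up for illustration. Note that the Jaccard coefficient here is computed on the sets of terms only and ignores the weights.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse TF-IDF vectors (term -> weight)."""
    # Only terms that occur in both documents contribute to the dot product.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty or all-zero document is treated as dissimilar.
    return dot / (norm_a * norm_b)


def jaccard_coefficient(a, b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    if not union:
        return 0.0  # Two empty documents: defined as 0 here by assumption.
    return len(terms_a & terms_b) / len(union)


# Hypothetical TF-IDF weights for two small documents.
doc1 = {"apple": 1.0, "banana": 2.0}
doc2 = {"banana": 2.0, "cherry": 1.0}

print(cosine_similarity(doc1, doc2))    # about 0.8
print(jaccard_coefficient(doc1, doc2))  # about 0.333 (1 shared term out of 3)
```

Both values are 0 when the documents share no terms and reach 1 when they are identical. Cosine similarity uses the TF-IDF weights, so it is usually the better fit once you have already computed them.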

bmargulies