Is there a rule for computing the cosine similarity between two documents that contain different numbers of words?
Viewed 298 times
1 Answer
2
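The standard formula does not require the two documents to have the same number of words. Build both word vectors over the union of the words that occur in either document. Every word that is in B but not in A gets a 0 in the word vector for A, and every word that is in A but not in B gets a 0 in the word vector for B. Both vectors then have the same length, and the words that appear in only one document add nothing to the dot product; they only affect the norm of the document that contains them.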
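As a rough illustration of the union-of-words idea, here is a minimal Python sketch. It assumes raw word counts as vector entries and plain lowercase whitespace splitting as the tokenizer; neither choice comes from the answer itself, and the function name `cosine_similarity` is only illustrative.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine similarity of two documents using raw word counts.

    Assumption: words are obtained by lowercasing and splitting on
    whitespace; a real application would likely tokenize more carefully.
    """
    counts_a = Counter(doc_a.lower().split())
    counts_b = Counter(doc_b.lower().split())

    # Union of the words of both documents.
    vocabulary = set(counts_a) | set(counts_b)

    # A Counter returns 0 for a missing word, which is exactly the
    # "0 in the word vector" for words the document does not contain.
    dot = sum(counts_a[word] * counts_b[word] for word in vocabulary)

    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty document has a zero vector; return 0.0 instead of dividing by zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Documents of different lengths (3 words versus 6 words) work fine.
similarity = cosine_similarity("the cat sat", "the cat sat on the mat")
```

Here the two documents have 3 and 6 words, yet both vectors are indexed by the same five-word vocabulary, so the formula applies unchanged.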

Udo Klein