
Which algorithm is best for semantic similarity? Can anyone explain why?

Thank you!

1 Answer


Semantic similarity of what - words, phrases, sentences, paragraphs, documents, other? And 'best' with respect to what end goal?

The original paper that defined 'Word Mover's Distance' (WMD), "From Word Embeddings To Document Distances", gave some examples of where WMD works well, and compared its behavior against other similarity calculations.

But WMD is far more expensive to calculate, especially on longer texts. And as a method that uses every word's presence regardless of ordering, it still isn't strong in cases where tiny grammatical changes, such as the addition of a 'not' in the right place, might completely reverse a text's meaning to human readers. (Then again, quick-and-simple comparisons like the cosine similarity between two bag-of-words representations, or between two average-of-word-vectors representations, aren't strong there either.)
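To make the negation and word-order point concrete, here is a minimal sketch of the simplest of those comparisons: cosine similarity over raw bag-of-words counts. The function name `bow_cosine`, the whitespace tokenization, and the example sentences are my own illustrative assumptions, not anything from the WMD paper or a particular library.

```python
import math
from collections import Counter


def bow_cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity between two bag-of-words count vectors.

    Hypothetical helper for illustration: tokenization is a plain
    lowercase whitespace split, with no stemming or stop-word removal.
    """
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Adding 'not' reverses the meaning but barely changes the word counts.
negated = bow_cosine("the movie was good", "the movie was not good")
# Swapping subject and object leaves the word counts identical.
reordered = bow_cosine("dog bites man", "man bites dog")

print(f"negation:   {negated:.2f}")
print(f"reordering: {reordered:.2f}")
```

The negated pair still scores about 0.89 and the reordered pair scores essentially 1.0, even though a human reader would call both pairs quite different in meaning. Any measure built purely on which words are present, WMD included, inherits some version of this blind spot.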

gojomo