I understand how LSA works when the similarity between two words is calculated. I am using LSA from the website lsa.colorado.edu, but I cannot find a source explaining how the similarity between sentences or groups of words is calculated. Is it just done by averaging over all pairwise word similarities?
2 Answers
You can combine word vectors simply by summing them and using the resulting sum as the sentence vector. Since this representation has the same form as a word vector, you can apply the existing methods for computing semantic similarity to it.
Then, to compute the semantic similarity, you can take the cosine of the angle between those sentence vectors.
I'm currently using the S-Space library, and it has a DocumentVectorBuilder class that performs this task.
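Here is a minimal, self-contained sketch of that approach in plain Java (not S-Space code). The toy word vectors and their dimensionality are made up for illustration; in practice they would come from your LSA space.

```java
import java.util.HashMap;
import java.util.Map;

public class LsaSentenceSimilarity {

    // Sum the LSA word vectors of a sentence to get a sentence vector.
    // Words missing from the vector space are simply skipped.
    static double[] sentenceVector(String[] words, Map<String, double[]> wordVectors, int dims) {
        double[] sum = new double[dims];
        for (String w : words) {
            double[] v = wordVectors.get(w);
            if (v == null) continue;
            for (int i = 0; i < dims; i++) sum[i] += v[i];
        }
        return sum;
    }

    // Cosine similarity: dot product divided by the product of the vector norms.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 3-dimensional "LSA" vectors, purely for illustration.
        Map<String, double[]> vectors = new HashMap<>();
        vectors.put("cat", new double[]{0.9, 0.1, 0.0});
        vectors.put("dog", new double[]{0.8, 0.2, 0.1});
        vectors.put("car", new double[]{0.1, 0.9, 0.3});

        double[] s1 = sentenceVector(new String[]{"cat", "dog"}, vectors, 3);
        double[] s2 = sentenceVector(new String[]{"dog", "car"}, vectors, 3);
        System.out.println("similarity = " + cosine(s1, s2));
    }
}
```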

German Attanasio
Cosine similarity is computed from the dot product of two vectors, normalized by the product of their lengths. So, once you have reduced your term-document frequency matrix with SVD, you take the resulting vectors for the two texts and apply the cosine formula to them.
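For concreteness, the cosine similarity of two vectors a and b is the dot product scaled by the vector lengths:

$$\cos(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}$$

A value of 1 means the vectors point in the same direction; the unscaled dot product alone is not a similarity score, because it grows with vector length.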

user5047207