I'm looking for an efficient way of creating a similarity vector of a single sentence against a list of sentences.
The trivial approach is to iterate over the list of sentences and compute the similarity between the single sentence and each sentence in it. That solution is too slow, and I'm looking for a faster way of doing it.
My final goal is to detect whether the list already contains a sentence that is very similar to the one I'm checking; if it does, I'll move on to the next sentence.
My solution right now is:
    for single_sentence in list_of_sentences:
        similarity_score = word2vec.sentences_similarity(sentence2test, single_sentence)
        if similarity_score >= similarity_th:
            ignore_sent_flag = True
            break
    list_of_sentences.append(sentence2test)
I've tried putting 'list_of_sentences' into a dictionary/set, but the improvement in running time was minor.
I came across this solution, but it is based on a Linux-only package, so it is not relevant for me.
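For context, here is a sketch of the direction I'm considering: if every sentence is first reduced to a fixed-size embedding (e.g. an averaged word2vec vector), the whole loop collapses into a single matrix-vector product that yields the similarity vector against all stored sentences at once. The embedding step and all names below are assumptions for illustration; random vectors stand in for real embeddings.

```python
import numpy as np

# Stand-in data: in practice each row would be a sentence embedding,
# e.g. the average of the word2vec vectors of the sentence's words.
rng = np.random.default_rng(0)
sentence_matrix = rng.normal(size=(1000, 300))  # embeddings of list_of_sentences
query_vec = rng.normal(size=300)                # embedding of sentence2test

# L2-normalize so one dot product per row gives cosine similarity.
normalized = sentence_matrix / np.linalg.norm(sentence_matrix, axis=1, keepdims=True)
query_unit = query_vec / np.linalg.norm(query_vec)

# Similarity of sentence2test against every stored sentence in one shot.
similarities = normalized @ query_unit          # shape: (1000,)

similarity_th = 0.8
ignore_sent_flag = bool(np.any(similarities >= similarity_th))
```

The matrix-vector product is handled by optimized BLAS routines, so it should be far faster than a Python-level loop calling a pairwise similarity function.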