doc2vec: any way to fetch closest matching terms for a given vector?

Question

My use case: I have a collection of "upvoted" documents and a collection of "downvoted" documents, and I want to use them to re-order a set of search results.

I am using gensim's doc2vec and can run most_similar queries on word(s) to fetch matching words. But how can I fetch the matching keywords for an arbitrary vector, for example one obtained by summing the doc vectors above?

score 0 · Accepted Answer · answered Jan 15 '18 at 17:58

Silly me, the answer was staring me right in the face. Posting it here in case anyone else runs into the same issue:

`similar_by_vector(vector, topn=10, restrict_vocab=None)`

Note, however, that this method is not on the Doc2Vec class itself but on the KeyedVectors class.


Santino


Note that `doc2vec_model.docvecs.most_similar()` also accepts raw vectors, but you should pass the vector explicitly as a list of positive examples, so that the vector array itself is not misinterpreted as an array of positive examples. Specifically, call it like: `doc2vec_model.docvecs.most_similar(positive=[new_vector])`. – gojomo Jan 16 '18 at 22:15