The above picture is from *Distributed Representations of Sentences and Documents*, the paper introducing Doc2Vec. I am using Gensim's implementations of Word2Vec and Doc2Vec, which are great, but I am looking for clarity on a few issues.
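For concreteness, here is roughly the kind of setup I have in mind (a minimal sketch: the corpus, tags, and parameters are made up, and I am on a Gensim version where the attribute is still called `docvecs`):

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus: each document gets a unique string tag (invented data, purely illustrative)
corpus = [
    TaggedDocument(words=["the", "cat", "sat", "on", "the", "mat"], tags=["doc_0"]),
    TaggedDocument(words=["dogs", "chase", "cats", "around", "the", "yard"], tags=["doc_1"]),
]

# Train a small model (PV-DM, dm=1, is the default)
dvm = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)

# The objects my questions below are about:
print(dvm.docvecs["doc_0"])   # is this d itself, or something combining d with word vectors?
print(dvm.wv["cat"])          # the word vectors, for comparison
```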
- For a given Doc2Vec model `dvm`, what is `dvm.docvecs`? My impression is that it is the averaged or concatenated vector combining all of the word embeddings and the paragraph vector `d`. Is this correct, or is it `d` itself?
- Supposing `dvm.docvecs` is not `d`, can one access `d` by itself? How?
- As a bonus, how is `d` calculated? The paper only says:
> In our Paragraph Vector framework (see Figure 2), every paragraph is mapped to a unique vector, represented by a column in matrix D and every word is also mapped to a unique vector, represented by a column in matrix W.
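For what it's worth, my rough reading of the framework equations around that sentence is the following (I may well be misreading them; here `d` is the paragraph vector, i.e. one column of D, and `k` is the context window size):

```latex
% Maximize the average log-probability of each word given its context
% words and the paragraph vector d (a column of D):
\frac{1}{T} \sum_{t=k}^{T-k} \log p(w_t \mid d, w_{t-k}, \ldots, w_{t+k})

% with the prediction made by a softmax over scores
%   y = b + U \, h(d, w_{t-k}, \ldots, w_{t+k}; W, D),
% where h concatenates or averages the paragraph vector with the
% context word vectors.
```

If that reading is right, my bonus question reduces to how Gensim fits the column of D during training and where it is stored afterwards.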
Thanks for any leads!