Identify the dimensions in doc2vec model

Question

I have created a doc2vec model with 100 dimensions. From what I have read, I understand that these dimensions are features of my model. How can I identify what exactly these dimensions are?

score 0 · Answer 1 · answered Sep 11 '17 at 17:51

The 'Paragraph Vectors' algorithms behind Doc2Vec simply give documents vectors whose distances and directions, relative to other co-trained document vectors, are interesting.

The individual dimensions don't have specific interpretable meanings. As with Word2Vec, there may be 'neighborhoods' of related items, and certain directions may vaguely map to understandable concepts.

But those directions aren't directly aligned with the individual perpendicular dimensions of the coordinate space. And there's nothing in the process that helps you describe those directional tendencies. (They tend to show up when you take differences between vectors, as in analogy-solving problems.)
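One way to convince yourself that the individual axes are arbitrary: rotate the whole space, and the distances/similarities between vectors stay the same even though every coordinate changes. Here is a minimal sketch of that idea in plain Python. The two "document vectors" are just random numbers standing in for real Doc2Vec output, and the helper names (`cosine`, `rotate`, `rotate_all`) are made up for this illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rotate(vec, i, j, theta):
    """Rotate vec by angle theta in the plane spanned by axes i and j."""
    out = list(vec)
    c, s = math.cos(theta), math.sin(theta)
    out[i] = c * vec[i] - s * vec[j]
    out[j] = s * vec[i] + c * vec[j]
    return out


def rotate_all(vec, plan):
    """Apply a sequence of plane rotations; together they form one rotation."""
    for i, j, theta in plan:
        vec = rotate(vec, i, j, theta)
    return vec


random.seed(0)
dim = 5  # assumption: tiny dimension for readability, not the 100 from the question

# Stand-ins for two trained document vectors (random, purely illustrative)
doc_a = [random.gauss(0, 1) for _ in range(dim)]
doc_b = [random.gauss(0, 1) for _ in range(dim)]

# The same rotation plan is applied to every vector in the space
plan = [(i, i + 1, random.uniform(0.3, 1.2)) for i in range(dim - 1)]
rot_a = rotate_all(doc_a, plan)
rot_b = rotate_all(doc_b, plan)

sim_before = cosine(doc_a, doc_b)
sim_after = cosine(rot_a, rot_b)

print("coordinates before:", [round(x, 3) for x in doc_a])
print("coordinates after: ", [round(x, 3) for x in rot_a])
print("similarity before and after:", round(sim_before, 6), round(sim_after, 6))
```

The coordinates are completely different after the rotation, but the similarity is unchanged. Since the useful information lives in the relative arrangement, a rotated copy of the space is just as good, and no particular axis can be said to "mean" anything.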

You can see an example in the 'Document Embedding With Paragraph Vectors' paper, Table 2, where Japanese pop artists who are (perhaps) similar to 'Lady Gaga' are discovered by shifting in space in the direction of -'American'+'Japanese'. That is, there's no one dimension that is Japanese-vs-American, but there is a directional tendency across all dimensions.
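To make the -'American'+'Japanese' shift concrete, here is a toy sketch in plain Python. Everything in it is invented for illustration: the 4-dimensional vectors, the two latent directions (`POP_DIR`, `NATION_DIR`), and the item names are hypothetical and are not taken from the paper or from any trained model. The only point is that the "nationality" direction has a nonzero component on every axis, yet the vector arithmetic still lands on the expected neighbor.

```python
import math

# Hypothetical latent directions, deliberately NOT aligned with any single axis
POP_DIR = [0.5, 0.5, 0.5, 0.5]        # +1 = pop, -1 = rock (toy assumption)
NATION_DIR = [0.5, -0.5, 0.5, -0.5]   # +1 = Japanese, -1 = American (toy assumption)


def combine(pop, nation):
    """Build a toy vector as a mix of the two latent directions."""
    return [pop * p + nation * n for p, n in zip(POP_DIR, NATION_DIR)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy "trained" vectors; the names are placeholders, not real model entries
vectors = {
    "lady_gaga": combine(1, -1),
    "jp_pop_artist": combine(1, 1),
    "us_rock_band": combine(-1, -1),
    "jp_rock_band": combine(-1, 1),
    "american": combine(0, -1),
    "japanese": combine(0, 1),
}

# Shift 'lady_gaga' in the direction of -'american' + 'japanese'
query = [
    g - a + j
    for g, a, j in zip(vectors["lady_gaga"], vectors["american"], vectors["japanese"])
]

# Rank the remaining items by similarity to the shifted point
inputs = {"lady_gaga", "american", "japanese"}
scores = {
    name: cosine(query, vec) for name, vec in vectors.items() if name not in inputs
}
best = max(scores, key=scores.get)

print("nationality direction per axis:", NATION_DIR)
print("nearest to shifted vector:", best)
```

In this toy setup the nearest item is `jp_pop_artist`, even though no single coordinate encodes nationality. With real trained vectors the result would be noisier, but the arithmetic is the same.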
