Concatenating two doc2vec models: Vector dimensions doubled

Question

I have a question about concatenating two doc2vec models. I followed the official gensim IMDB example on doc2vec and applied it to example data.

When concatenating two models (PV-DM + PV-DBOW), as outlined in the original paper, I was surprised that the concatenated model has 400 dimensions rather than 200 like each of the two input models:

Shape Train(11948, **400**)
Shape Test(2987, **400**)

The input shapes were each:

np.asarray(X_train).shape
(11948, **200**)
(2987, **200**)

Is this correct? I expected the number of dimensions to be 200 again.

score 1 · Answer 1 · answered Feb 08 '18 at 13:21

This is correct. PV-DM and PV-DBOW are two different models, each producing its own embedding of dimension dim, where dim=200 in your case. Concatenation places the two vectors for each document end to end, so the resulting dimension is 2 × dim = 400.
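As a minimal sketch of why the shape doubles, the snippet below uses random NumPy arrays as stand-ins for the document vectors of the two models. The names `pv_dm_vecs` and `pv_dbow_vecs` and the document count are illustrative assumptions, not taken from the gensim example.

```python
import numpy as np

n_docs, dim = 5, 200  # illustrative document count; dim matches the question

# Random placeholders standing in for the vectors inferred by each model.
rng = np.random.default_rng(0)
pv_dm_vecs = rng.normal(size=(n_docs, dim))    # hypothetical PV-DM output
pv_dbow_vecs = rng.normal(size=(n_docs, dim))  # hypothetical PV-DBOW output

# Join each document's two vectors end to end along the feature axis.
concatenated = np.hstack([pv_dm_vecs, pv_dbow_vecs])

print(concatenated.shape)  # (5, 400)
```

The number of rows (documents) is unchanged; only the feature axis grows, with the first 200 columns coming from one model and the last 200 from the other.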

geompalik
