I have a question about concatenating two doc2vec models. I followed the official gensim IMDB example on doc2vec and applied it to example data.

When concatenating two models (PV-DM + PV-DBOW), as outlined in the original paper, I was surprised that the concatenated model has 400 dimensions rather than 200 like each of the two input models:

Shape Train(11948, **400**)
Shape Test(2987, **400**)

The input shapes were each:

np.asarray(X_train).shape
(11948, **200**)
(2987, **200**)

Is this correct? I expected the number of dimensions to be 200 again.

Christopher

1 Answer

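This is correct. PV-DM and PV-DBOW are two different models, each producing its own embedding of dimension dim, where dim=200 in your case. Concatenation places the two vectors side by side for each document, so the resulting dimension is the sum of the two, 200 + 200 = 400. The number of rows (documents) stays the same.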
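To make the shape arithmetic concrete, here is a minimal sketch using plain NumPy. The random arrays are only stand-ins for the vectors your two models produce, and the names `X_train_dm`, `X_train_dbow`, and `X_train_concat` are hypothetical, not taken from your code or from gensim:

```python
import numpy as np

n_train, dim = 11948, 200  # row count and per-model dimension from the question

# Hypothetical stand-ins for the document vectors of each model.
rng = np.random.default_rng(0)
X_train_dm = rng.standard_normal((n_train, dim))    # PV-DM vectors
X_train_dbow = rng.standard_normal((n_train, dim))  # PV-DBOW vectors

# Concatenate along the feature axis: each row becomes [dm_vector, dbow_vector].
X_train_concat = np.hstack([X_train_dm, X_train_dbow])

print(X_train_dm.shape, X_train_dbow.shape, X_train_concat.shape)
```

In the concatenated array, the first 200 columns of each row are the PV-DM vector and the last 200 are the PV-DBOW vector for the same document.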

geompalik