
My professor's slides compare the "Neural Net Language Model" (Bengio et al., 2003) with Google's word2vec (Mikolov et al., 2013). They say that, unlike Bengio's model, in word2vec "the projection layer is shared (not just the weight matrix)".

What does it mean? Shared across what?

The other differences are that there is no hidden layer in Mikolov's model, and that the context contains words from both the past and the future (whereas only words from the past are used in Bengio's model).

I understood these latter differences, but I have difficulty understanding the "shared layer" concept.
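To make my question concrete, here is a toy numpy sketch of how I currently picture the two projection steps. This is only my guess at what "shared" means (all the names and shapes here are mine, not from the slides): in the NNLM each context position keeps its own slot and the embeddings are concatenated, while in word2vec's CBOW the context embeddings are averaged into a single vector, so the projection layer itself is shared across positions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 4                 # toy vocabulary size and embedding dimension
C = rng.normal(size=(V, d))  # the embedding (weight) matrix, shared in BOTH models
context = [2, 5, 7]          # indices of three context words

# Bengio-style NNLM (my understanding): each context position gets its own
# slot in the projection -- embeddings are concatenated, so order matters.
nnlm_projection = np.concatenate([C[w] for w in context])   # shape (3*d,)

# word2vec CBOW (my guess at "shared projection layer"): all context
# embeddings are averaged into one d-dimensional vector, so the positions
# are indistinguishable and every context word projects to the same place.
cbow_projection = np.mean([C[w] for w in context], axis=0)  # shape (d,)

print(nnlm_projection.shape)  # (12,)
print(cbow_projection.shape)  # (4,)
```

Is this averaging what "the projection layer is shared (not just the weight matrix)" refers to, or does "shared" mean something else here?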
