I'm working on a Seq2Seq model for abstractive summarization using the GloVe pre-trained word embeddings. Do I need to build two embedding matrices, one that covers the source vocabulary and one that covers the summary vocabulary?
No, the common practice is to share a single embedding matrix, even in machine translation where the source and target words come from different languages.
Sometimes, the embedding matrix is also reused as the output projection matrix when generating the model output (see, e.g., the "Attention Is All You Need" paper). However, this is only practical with a vocabulary of tens of thousands of (sub)words, as opposed to the very large vocabulary of GloVe.
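For illustration, here is a minimal PyTorch sketch of the idea: one embedding matrix covering the joint source and summary vocabulary, used by both the encoder and the decoder, with optional weight tying to the output projection. The class and parameter names (`Seq2Seq`, `pretrained`, etc.) are illustrative, not from your setup.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, pretrained=None):
        super().__init__()
        # One embedding matrix for the joint source + summary vocabulary.
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        if pretrained is not None:  # e.g. a (vocab_size, emb_dim) GloVe tensor
            self.embedding.weight.data.copy_(pretrained)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out_proj = nn.Linear(hidden_dim, vocab_size, bias=False)
        # Optional weight tying: reuse the embedding as the output projection.
        # Only possible when emb_dim == hidden_dim.
        if emb_dim == hidden_dim:
            self.out_proj.weight = self.embedding.weight

    def forward(self, src_ids, tgt_ids):
        # Encoder and decoder both look tokens up in the same embedding matrix.
        enc_out, state = self.encoder(self.embedding(src_ids))
        dec_out, _ = self.decoder(self.embedding(tgt_ids), state)
        return self.out_proj(dec_out)  # (batch, tgt_len, vocab_size) logits
```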

Jindřich
Thanks for clarifying this, Jindřich. I would upvote your answer but I don't have the reputation yet. Thanks again – eliboy8 Jul 07 '21 at 16:01