
What are the differences among ELMo, BERT, and GloVe in word representation? How differently do they perform on word-embedding tasks? Which one is better, and what advantages and disadvantages does each have compared with the others?

1 Answer


This is a big question.

I will concentrate on word representation.

ELMo, BERT, and GloVe can be divided into two broad groups: GloVe is a non-contextual (static) word embedding, while ELMo and BERT are contextual word embeddings.
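To make that split concrete, here is a minimal sketch (assuming the `gensim` and `transformers` libraries and the pretrained `glove-wiki-gigaword-100` and `bert-base-uncased` models are available): GloVe returns one fixed vector for "bank" no matter which sentence it appears in, while BERT produces a different vector for each occurrence because the vector depends on the surrounding words.

```python
import gensim.downloader as api
import torch
from transformers import AutoModel, AutoTokenizer

# Non-contextual: GloVe stores a single fixed vector per word type.
glove = api.load("glove-wiki-gigaword-100")
print(glove["bank"][:5])  # the same vector whatever sentence "bank" appears in

# Contextual: BERT computes a vector per token occurrence, conditioned on the sentence.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bert_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the last-layer hidden state for `word` inside `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

v_money = bert_vector("I deposited cash at the bank.", "bank")
v_river = bert_vector("We picnicked on the bank of the river.", "bank")

# The two "bank" vectors differ because their sentence contexts differ.
cos = torch.nn.functional.cosine_similarity(v_money, v_river, dim=0)
print(f"cosine similarity between the two 'bank' vectors: {cos.item():.3f}")
```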

The contextual group can be further divided into unidirectional models (ELMo) and bidirectional models (BERT).

First, we can try to understand these four terms: non-contextual vs. contextual word embeddings, and unidirectional vs. bidirectional models.

Afterward, we can go deeper into the other differences.
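To illustrate the unidirectional/bidirectional distinction, here is a rough sketch, again assuming the `transformers` library; GPT-2 is used only as a convenient stand-in for a purely left-to-right language model, since ELMo itself is not shipped with `transformers`. A bidirectional masked LM such as BERT conditions on the words on both sides of a gap, whereas a left-to-right LM can only condition on the preceding words.

```python
from transformers import pipeline

# Bidirectional: BERT fills in a masked token using context on BOTH sides of the gap.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill_mask("The river [MASK] was covered in reeds.")[:3]:
    print("BERT suggests:", pred["token_str"])

# Unidirectional (left-to-right): a causal LM predicts each next token from the
# preceding words only; it never sees what comes after.
generator = pipeline("text-generation", model="gpt2")
print(generator("The river", max_new_tokens=5)[0]["generated_text"])
```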

Tan Phan
  • ELMo is not a unidirectional model. Please see the original paper [Deep contextualized word representations](https://arxiv.org/pdf/1802.05365.pdf) by Peters et al., particularly Sections 3.1 and 3.2, or [this tutorial](https://jalammar.github.io/illustrated-bert/) for an illustrated depiction of how it works. ELMo combines the intermediate layers of both a forward and backward Language Model (LM), specifically forward and backward LSTMs. – Kyle F Hartzenberg Oct 27 '22 at 06:18