I am trying to learn how to use ELMo embeddings via this tutorial:
https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md
I am specifically trying to use the interactive mode described there:
$ ipython
> from allennlp.commands.elmo import ElmoEmbedder
> elmo = ElmoEmbedder()
> tokens = ["I", "ate", "an", "apple", "for", "breakfast"]
> vectors = elmo.embed_sentence(tokens)
> assert(len(vectors) == 3) # one for each layer in the ELMo output
> assert(len(vectors[0]) == len(tokens))  # the vector elements correspond with the input tokens
> import scipy.spatial.distance
> vectors2 = elmo.embed_sentence(["I", "ate", "a", "carrot", "for", "breakfast"])
> scipy.spatial.distance.cosine(vectors[2][3], vectors2[2][3])  # cosine distance between "apple" and "carrot" in the last layer
0.18020617961883545
My overall question is: how do I make sure I'm using the pre-trained ELMo model trained on the original 5.5B token set (described here: https://allennlp.org/elmo)?
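My guess is that I need to pass that model's options and weights files to the ElmoEmbedder constructor, roughly like the sketch below. The two URLs are what I copied down from the model page for the 5.5B model, but I'm not certain they are the right files:

> from allennlp.commands.elmo import ElmoEmbedder
> # options/weights for the "Original (5.5B)" model -- my guess, copied from https://allennlp.org/elmo
> options_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_options.json"
> weight_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5"
> elmo = ElmoEmbedder(options_file=options_file, weight_file=weight_file)

Is that all it takes, or do I need to change anything else?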
I don't quite understand why we have to call "assert" or why we use the [2][3] indexing on the vector output.
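My current understanding (please correct me if it's wrong) is that the asserts are just sanity checks on the output shape, and that the result is indexed as [layer][token]:

> vectors.shape    # I'd expect (3, 6, 1024): 3 layers, 6 tokens, 1024-dim vectors
> vectors[2][3]    # top layer (index 2), token index 3, i.e. "apple"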
My ultimate purpose is to average all the word embeddings in order to get a sentence embedding, so I want to make sure I do it right!
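Here is what I was planning, assuming vectors[2] really is the (num_tokens, 1024) top-layer output; is averaging over the token axis the right way to do it?

> import numpy as np
> sentence_embedding = np.mean(vectors[2], axis=0)    # one 1024-dim vector for the sentence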
Thanks for your patience, as I am pretty new to all this.