As of v2.1, spaCy ships a BERT-style language model (LM). It predicts word vectors rather than words, so I'll use "words" and "word vectors" interchangeably here.
I need to take a sentence with one word masked, plus a list of candidate words, and rank the candidates by how likely they are to fill the masked slot. Currently I use BERT for this (similar to bert-syntax), and I'd like to see whether spaCy's performance on this task is acceptable. Between this file and this one, I'm fairly sure it's possible to build something, but that feels like reaching deeper into the library's internals than I'd like.
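To make the task concrete: since this LM predicts word vectors instead of a softmax over a vocabulary, ranking candidates presumably means comparing each candidate's embedding against the vector the model predicts for the masked slot. Here's a minimal sketch of that ranking step with toy NumPy vectors standing in for real embeddings; `rank_candidates` and the vector values are my own hypothetical names and numbers, not part of any spaCy API.

```python
import numpy as np

def rank_candidates(predicted_vec, candidate_vecs):
    """Rank candidate words by cosine similarity between their word
    vectors and the vector the LM predicts for the masked slot."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {word: cosine(predicted_vec, vec)
              for word, vec in candidate_vecs.items()}
    # Highest similarity first.
    return sorted(scores, key=scores.get, reverse=True)

# Toy 3-d vectors (hypothetical); real embeddings would come from the model.
predicted = np.array([1.0, 0.0, 0.0])
candidates = {
    "cat": np.array([0.9, 0.1, 0.0]),   # nearly parallel to the prediction
    "car": np.array([0.0, 1.0, 0.0]),   # orthogonal to the prediction
}
print(rank_candidates(predicted, candidates))  # → ['cat', 'car']
```

With a softmax-based LM like BERT you'd rank by the masked token's output probabilities instead; the cosine comparison is specific to the predict-a-vector setup.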
Is there a straightforward way to interact with spaCy's masked language model?