
After running this

import spacy

nlp = spacy.load('en_core_web_lg')
has_vector = nlp('test text').has_vector
print(has_vector)  # True

But after running this

nlp = spacy.load('en_core_web_trf')
has_vector = nlp('test text').has_vector
print(has_vector)  # False

What am I missing?

Vy Do
VikR
  • You asked this on the spaCy forums too, but the Transformer pipelines don't include word vectors because you generally don't need them if you have Transformers. https://github.com/explosion/spaCy/discussions/9076 – polm23 Aug 28 '21 at 05:11
  • If you'd like to post this as an answer, I will mark it as the correct answer. – VikR Aug 28 '21 at 16:28

2 Answers


has_vector refers specifically to static word vectors, not to the contextual vectors generated by Transformers. Because you generally don't need word vectors when you're using Transformers, the spaCy Transformer pipelines don't include them, which is why has_vector is False for en_core_web_trf.
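
One way to check this yourself is to look at the vector table each pipeline ships with. The following is only a minimal sketch, assuming both model packages are installed; vector_report and compare are hypothetical helper names made up for illustration, not part of spaCy.

```python
def vector_report(nlp, text="test text"):
    """Summarise whether a loaded pipeline exposes static word vectors."""
    doc = nlp(text)
    # vocab.vectors.shape is (number of vector rows, vector width)
    n_rows, n_dims = nlp.vocab.vectors.shape
    return {
        "has_vector": doc.has_vector,
        "vector_rows": n_rows,
        "vector_dims": n_dims,
    }


def compare(model_names=("en_core_web_lg", "en_core_web_trf")):
    """Load each pipeline that is installed and report on its vectors."""
    try:
        import spacy
    except ImportError:
        print("spaCy is not installed")
        return {}
    reports = {}
    for name in model_names:
        try:
            nlp = spacy.load(name)
        except OSError:
            # spacy.load raises OSError when the package cannot be found
            print(f"{name}: not installed, skipping")
            continue
        reports[name] = vector_report(nlp)
        print(name, reports[name])
    return reports


if __name__ == "__main__":
    compare()
```

If the explanation above holds, the report for en_core_web_lg should show a non-empty vector table, while the one for en_core_web_trf should not.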

polm23

As described in the docs (https://spacy.io/usage/embeddings-transformers), though, you can still get the Transformer's output vector using:

nlp('test text')._.trf_data.tensors[-1]
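
A slightly more defensive version of the same lookup might look like the sketch below. transformer_output is a hypothetical helper name. The fallback to last_hidden_layer_state is an assumption about newer en_core_web_trf releases, where trf_data may be a different object without a tensors attribute. What each entry of tensors holds depends on the underlying model, so it is worth inspecting the shapes before relying on them.

```python
def transformer_output(doc):
    """Return the transformer output stored on a Doc, or None if absent."""
    # doc._.trf_data only exists when a transformer component has run
    trf_data = getattr(doc._, "trf_data", None)
    if trf_data is None:
        return None
    # Layout used in the line above: a list of output tensors
    tensors = getattr(trf_data, "tensors", None)
    if tensors is not None:
        return tensors[-1]
    # Assumption: newer pipelines expose the last hidden layer directly
    return getattr(trf_data, "last_hidden_layer_state", None)


if __name__ == "__main__":
    try:
        import spacy
        nlp = spacy.load("en_core_web_trf")
    except (ImportError, OSError):
        print("spaCy or en_core_web_trf is not installed")
    else:
        output = transformer_output(nlp("test text"))
        print(type(output))
```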
JASON G PETERSON