I am trying to find words similar to a given word using spaCy. I've loaded the en_core_web_md and en_core_web_lg models and am using their word vectors to find similar words.

When I run this outside of my virtual environment, there is no issue; however, inside the venv, when I run:

dog = nlp.vocab[u'dog']
for x in dog.vocab:
    print(x.text)

it produces results such as:

Nothin
5
Nuthin’
ll
:)
'y
this
-_-
(ಠ_ಠ)
’em

which are not even remotely similar to 'dog'. When I try other words, such as 'lively' or 'banana', I get exactly the same results as for 'dog'.

gmdev

1 Answer

It looks like you might have two different versions of spaCy installed, probably v2.2 in one environment and v2.3 in the other.

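Note that dog.vocab is the same shared vocabulary object as nlp.vocab, so the loop lists vocabulary entries rather than words similar to 'dog', which is why every word gives the same output. Instead of iterating over nlp.vocab, iterate over nlp.vocab.vectors, which should work in any version of spaCy v2 and skips lexemes that don't have vectors: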

for orth in nlp.vocab.vectors:
    w = nlp.vocab[orth]
    # do something

See the section on preloaded vocab in the v2.3 migration guide here: https://spacy.io/usage/v2-3#migrating
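
To actually rank words by similarity to 'dog', the loop over nlp.vocab.vectors can be combined with Lexeme.similarity. The following is only a rough sketch, not a tested recipe for any particular spaCy release: the helper name most_similar and its parameter n are made up for illustration, and it assumes a spaCy v2 model with vectors (such as en_core_web_md) has been loaded into nlp.

```python
import heapq


def most_similar(nlp, word, n=10):
    """Return up to n (text, score) pairs ranked by similarity to `word`.

    `most_similar` is a hypothetical helper, not part of spaCy.
    `nlp` is assumed to be a loaded pipeline whose vocab has vectors.
    """
    target = nlp.vocab[word]
    if not target.has_vector:
        raise ValueError("no vector available for %r" % word)

    scored = []
    # Iterating over the vectors table yields only keys that have a vector.
    for orth in nlp.vocab.vectors:
        lex = nlp.vocab[orth]
        if lex.orth == target.orth:
            continue  # skip the query word itself
        scored.append((lex.text, target.similarity(lex)))

    # Keep the n highest-scoring candidates, best first.
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Example usage (assumes spaCy and the model are installed):
# import spacy
# nlp = spacy.load("en_core_web_md")
# for text, score in most_similar(nlp, "dog", n=10):
#     print(text, score)
```

Scoring every entry in a pure Python loop can be slow for the large models, so this is meant to show the idea rather than to be fast.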

aab