Gensim 4.0.0, with many fixes & performance improvements, has also changed some property/method names for simplicity & long-term consistency. A project wiki page has a guide to adapting older code to match the new APIs:
https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4
But your code doesn't need to use .vocab at all. The set of word-vectors, w2v_model.wv, already supports Python's 'in' operator, so it can tell you directly whether it contains a given key. So the following code should work both pre-4.0 and in 4.0-and-above:
if word in self.w2v_model.wv:
    vector = self.w2v_model.wv[word]
else:
    vector = [0] * 100
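If the same lookup is needed in several places, it could be wrapped in a small helper. The following is only a sketch: the function name vector_or_zeros and the toy_wv dictionary are hypothetical, and the plain dict merely stands in for w2v_model.wv so the example runs without Gensim installed. It relies on nothing more than the object supporting 'in' and [] indexing.

```python
def vector_or_zeros(wv, word, size=100):
    """Return wv[word] if the key is present, else a zero vector of length `size`.

    `wv` is any object supporting `in` and `[]` lookups (an assumption of
    this sketch); a Gensim word-vector set such as w2v_model.wv qualifies.
    """
    if word in wv:
        return wv[word]
    return [0] * size


# Hypothetical stand-in for w2v_model.wv: a plain dict of 3-dimensional vectors.
toy_wv = {"cat": [0.1, 0.2, 0.3], "dog": [0.4, 0.5, 0.6]}

known = vector_or_zeros(toy_wv, "cat", size=3)      # key present: stored vector
unknown = vector_or_zeros(toy_wv, "zebra", size=3)  # key absent: zero fallback
```

With a real model, the call would look like vector_or_zeros(self.w2v_model.wv, word, size=self.w2v_model.wv.vector_size), which keeps the fallback length in step with the model instead of hard-coding 100. Note that a real lookup returns a NumPy array while this fallback is a plain list, so downstream code that needs a uniform type may prefer a NumPy zero array.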
(Separately, if you did choose to keep using an older Gensim to put off any other code changes, it'd be better to use 3.8.3, the last in the 3.x series, released May 2020, rather than the older/buggier 3.8.1, released in September 2019. But some key word2vec-related operations will be faster and use less memory in gensim-4.0.0 & higher, so rolling back should be avoided if possible.)