So I'm trying to use a pretrained Doc2vec model for my semantic search project. I tried this one, https://github.com/jhlau/doc2vec (English Wikipedia DBOW), with the forked version of Gensim (0.12.4) and Python 2.7. It works fine when I use most_similar, but when I try to use infer_vector I get this error: AttributeError: 'Doc2Vec' object has no attribute 'neg_labels'. What can I do to make this work?

1 Answer

For reasons given in this other answer, I'd recommend against using a many-years-old custom fork of Gensim. I also find those particular pre-trained models a little fishy: their file sizes seem too small to actually contain all the purported per-article vectors.

But also: that error resembles a very old bug that only showed up if Gensim was not fully installed with the Cython-optimized routines it needs for fast training/inference. (In that situation, some older, seldom-run fallback code ran instead, and that code depended on the missing neg_labels. Newer versions of Gensim have eliminated that slow code path entirely.)
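A quick way to check for that situation is to look at the flag Gensim sets when its compiled routines load. This is only a sketch: the helper name gensim_fast_version is made up for illustration, and it assumes the installed Gensim exposes FAST_VERSION in gensim.models.word2vec (older releases set it to -1 when they fell back to the slow path). If the attribute or the package is missing, the helper just returns None.

```python
import importlib


def gensim_fast_version():
    """Return Gensim's FAST_VERSION flag, or None if it cannot be determined.

    Hypothetical helper for illustration. Assumes the installed Gensim
    exposes FAST_VERSION in gensim.models.word2vec.
    """
    try:
        word2vec = importlib.import_module("gensim.models.word2vec")
    except ImportError:
        # Gensim is not installed (or not importable) in this environment.
        return None
    return getattr(word2vec, "FAST_VERSION", None)


if __name__ == "__main__":
    flag = gensim_fast_version()
    if flag is None:
        print("Could not read FAST_VERSION (Gensim missing or attribute absent).")
    elif flag < 0:
        print("Compiled routines not loaded; Gensim is on the slow fallback path.")
    else:
        print("Compiled routines loaded, FAST_VERSION =", flag)
```

A negative value would suggest reinstalling Gensim in an environment where the compiled extensions can be built or a prebuilt wheel is available.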

My comment on an old Gensim issue has more details, and a workaround that might help - but really, the much better thing to do for quality results & speedy code is to use a current Gensim, & train your own model.

gojomo
  • Thank you, this was very helpful. If I don't use these pre-trained models and this fork of Gensim, but can't train my own model for my use case, where can I find a more recent pretrained doc2vec model? – ahmed belgacem Jun 28 '21 at 09:23
  • I'm not familiar with other public pretrained models as of right now (June 2021), sorry. – gojomo Jun 28 '21 at 17:38