
I am building the vocabulary table with Doc2Vec, but I get the error "AttributeError: module 'gensim.utils' has no attribute 'smart_open'". How do I solve this?

This is for a notebook on the Databricks platform, running Python 3. I previously tried running the code in a local Jupyter Notebook, but the same error occurred.

I've also searched https://radimrehurek.com/gensim/models/doc2vec.html but could not find anything related to smart_open.

model = Doc2Vec(window=5, min_count=1, size=50, sample=1e-5, negative=5, workers=1)

model.build_vocab(sentences.to_array())

I ran the two lines above separately. The first worked fine. The second raises: AttributeError: module 'gensim.utils' has no attribute 'smart_open'

  • Can you show the full error stack, so that it's clear which line(s), in both your code and the library code being called, are involved in the error? (Also, what is your `sentences`, and why is it being converted with `to_array()`? A typical corpus for `Word2Vec` doesn't need to be a raw array; any re-iterable sequence, with each item being a list of words, would work.) – gojomo Jul 24 '19 at 01:12

2 Answers


I believe you get this error because you installed a newer gensim version. You can either:

(1) update the call as the following warning suggests: /python3.7/site-packages/smart_open/smart_open_lib.py:398: UserWarning: This function is deprecated, use smart_open.open instead. See the migration notes for details: https://github.com/RaRe-Technologies/smart_open/blob/master/README.rst#migrating-to-the-new-open-function

OR: (2) pip install gensim==3.4.0
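
For option (1), here is a minimal sketch of what the updated call might look like. It assumes the failing `utils.smart_open(...)` call sits in your own corpus-reading code, which the question does not show. `read_tokenized_lines` is a hypothetical helper, not part of gensim, and the fallback to the built-in `open` is only there so the sketch still runs when smart_open is not installed.

```python
try:
    # smart_open.open is the replacement named in the deprecation warning
    from smart_open import open as so_open
except ImportError:
    # Assumption: for plain local text files the built-in open is enough
    so_open = open


def read_tokenized_lines(path):
    """Yield (line_number, tokens) for each line of a text file.

    Hypothetical helper: one document per line, tokens split on whitespace.
    """
    # Before: with utils.smart_open(path) as fin:
    with so_open(path, "r", encoding="utf-8") as fin:
        for line_no, line in enumerate(fin):
            yield line_no, line.split()
```

The only change compared with the old style is the opener itself; the loop over lines stays the same.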

Hope this helps.

sonvx

I tried the first method from the answer above, but unfortunately it did not apply to my case. Looking into gensim's "utils.py", I found that the import related to smart_open is "from smart_open import open". So I used "utils.open()" instead, which solved my problem exactly ^-^

By the way, my gensim version is 3.8.3.
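
If the same code has to run against both older and newer gensim versions, a small shim along these lines might help. This is only a sketch: `pick_opener` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins for an old-style and a new-style `gensim.utils` module so that the example runs without gensim installed.

```python
import builtins
import types


def pick_opener(utils_module):
    """Return the first available file opener on a module-like object.

    Prefers the older `smart_open` attribute, then `open`, and finally
    falls back to the built-in open.
    """
    for name in ("smart_open", "open"):
        opener = getattr(utils_module, name, None)
        if callable(opener):
            return opener
    return builtins.open


# Stand-ins (assumptions) for gensim.utils before and after the change
old_style = types.SimpleNamespace(smart_open=lambda path: "old:" + path)
new_style = types.SimpleNamespace(open=lambda path: "new:" + path)

print(pick_opener(old_style)("corpus.txt"))
print(pick_opener(new_style)("corpus.txt"))
```

With gensim installed, you would pass `gensim.utils` itself instead of the stand-ins.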

F. C.