Why does Gensim doc2vec give AttributeError: 'list' object has no attribute 'words'?

Question

I am trying to experiment with gensim doc2vec using the following code. As far as I understand from the tutorials, it should work. However, it raises AttributeError: 'list' object has no attribute 'words'.

from gensim.models.doc2vec import LabeledSentence, Doc2Vec
document = LabeledSentence(words=['some', 'words', 'here'], tags=['SENT_1']) 
model = Doc2Vec(document, size = 100, window = 300, min_count = 10, workers=4)

So what did I do wrong? Any help would be appreciated. Thank you. I am using Python 3.5 and gensim 0.12.4.

`LabeledSentence` got deprecated: https://medium.com/@gofortargets/doc2vec-word2vec-in-gensim-c9321c780079 – Ufos, Jul 27 '18 at 16:44

score 4 · Accepted Answer · answered Apr 14 '16 at 08:23


The input to gensim.models.doc2vec.Doc2Vec should be an iterable of LabeledSentence objects (for example, a list), not a single LabeledSentence. Try:

model = Doc2Vec([document], size = 100, window = 1, min_count = 1, workers=1)

I have reduced window and min_count so that they make sense for the given input: with min_count = 10, every word of a three-word document would be dropped from the vocabulary. Also go through this nice tutorial on Doc2Vec, if you haven't already.
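To see where the error message comes from, here is a minimal sketch that does not use gensim at all. It assumes that LabeledSentence behaves like a namedtuple with `words` and `tags` fields, which is how gensim releases of this era define it; the `Doc` class and `count_words` function below are hypothetical stand-ins, not gensim code.

```python
from collections import namedtuple

# Hypothetical stand-in for gensim's LabeledSentence (assumption: a
# namedtuple with `words` and `tags` fields). Not the real class.
Doc = namedtuple("Doc", ["words", "tags"])

document = Doc(words=["some", "words", "here"], tags=["SENT_1"])


def count_words(documents):
    """Mimic a training loop that reads `.words` from each item."""
    return sum(len(doc.words) for doc in documents)


# Passing the bare document: iterating over a namedtuple yields its
# fields, so each "doc" is a plain list and has no `.words` attribute.
try:
    count_words(document)
    bare_error = None
except AttributeError as exc:
    bare_error = str(exc)

# Wrapping it in a list: iteration yields the document itself.
wrapped_total = count_words([document])

print(bare_error)
print(wrapped_total)
```

Under that assumption, passing `document` directly makes the model iterate over the two lists inside it, which matches the error in the question, while `[document]` gives it one well-formed item.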


kampta


Thanks for helping, but now I get this error: OverflowError: Python int too large to convert to C long. Do you know why? Thanks. – W.S. Apr 15 '16 at 09:28
At which step are you getting this error? Can you post your error trace? – kampta Apr 15 '16 at 09:40
I think it was below: File "C:\Anaconda3\envs\sandbox\lib\site-packages\gensim\models\word2vec.py", line 944, in seeded_vector once = random.RandomState(uint32(self.hashfxn(seed_string))) OverflowError: Python int too large to convert to C long – W.S. Apr 15 '16 at 09:44
Is your input the same as the one in the question? Please provide an MCVE (http://stackoverflow.com/help/mcve) – kampta Apr 15 '16 at 12:34
Yes. from gensim.models.doc2vec import LabeledSentence, Doc2Vec document = LabeledSentence(words=['some', 'words', 'here'], tags=['SENT_1']) model = Doc2Vec([document], size = 100, window = 1, min_count = 1, workers=1) – W.S. Apr 15 '16 at 20:30
Alright, I wasn't able to reproduce the error; however, https://www.kaggle.com/c/word2vec-nlp-tutorial/forums/t/11197/gensim-word2vec-cython-on-windows/68017#post68017 seems to have a solution. Define your own hash function and pass it in for training: `def hash32(value): return hash(value) & 0xffffffff` and then `model = Doc2Vec([document], size = 100, window = 1, min_count = 1, workers=1, hashfxn=hash32)` – kampta Apr 16 '16 at 02:56
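The workaround from the last comment can be sketched on its own. The traceback suggests that the value returned by the default hash function did not fit the integer type used when seeding the random state, which is plausible on Windows, where a C long is 32 bits. Masking the hash to its low 32 bits keeps it in range. The `seeds` list below is only illustrative input, not what gensim passes internally.

```python
def hash32(value):
    """Return Python's built-in hash of `value`, masked to 32 bits."""
    return hash(value) & 0xFFFFFFFF


# Illustrative inputs only; gensim builds its own seed strings.
seeds = ["some", "words", "here", "SENT_1"]
hashed = [hash32(seed) for seed in seeds]

# Every masked value lies in the unsigned 32-bit range, even when the
# underlying hash() result is negative or wider than 32 bits.
fits_uint32 = all(0 <= h <= 0xFFFFFFFF for h in hashed)
print(fits_uint32)
```

As in the comment, the function is then passed to the model with `hashfxn=hash32`. Note that Python 3 randomizes string hashes per process unless PYTHONHASHSEED is set, so the masked values differ from run to run.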
