19

I'm working on a project using Word2Vec and gensim:

model = gensim.models.Word2Vec(
    documents = 'userDataFile.txt',
    size=150,
    window=10,
    min_count=2,
    workers=10)
model = gensim.model.Word2Vec.load("word2vec.model")
model.train(documents, total_examples=len(documents), epochs=10)
model.save("word2vec.model")

This is the part of the code that I have at the moment, and I'm getting the error below:

Traceback (most recent call last):
  File "C:\Users\User\Desktop\InstaSubProject\templates\HashtagData.py", line 37, in <module>
    workers=10)
TypeError: __init__() got an unexpected keyword argument 'documents'

userDataFile.txt is the file where I stored the output data that I got from web scraping.

I'm not really sure what I need to fix here.

Thank you in advance!

dubooduboo

4 Answers

86

The year is 2021 and if you're here for the same reason I am, it's because you're getting the same error on the size parameter.

You need to use vector_size instead; gensim 4 renamed the size parameter to vector_size.
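For code that still uses the old keyword names, one option is to translate them before calling the constructor. The following is only a minimal sketch covering the two renames mentioned in this thread (size to vector_size, iter to epochs); the helper name to_gensim4_kwargs is hypothetical and not part of gensim.

```python
# Hypothetical helper, not part of gensim: renames gensim 3.x keyword
# arguments to the names the 4.x Word2Vec constructor expects.
RENAMED_KWARGS = {
    "size": "vector_size",  # dimensionality of the word vectors
    "iter": "epochs",       # number of passes over the corpus
}


def to_gensim4_kwargs(old_kwargs):
    """Return a copy of old_kwargs with 3.x names replaced by 4.x names."""
    return {RENAMED_KWARGS.get(name, name): value
            for name, value in old_kwargs.items()}


old_style = {"size": 150, "window": 10, "min_count": 2, "workers": 10, "iter": 5}
new_style = to_gensim4_kwargs(old_style)
print(new_style)
# Intended use (requires gensim 4.x and a tokenized corpus named `sentences`):
#   model = gensim.models.Word2Vec(sentences, **new_style)
```

Keyword names that were not renamed pass through unchanged, so the same dictionary can carry window, min_count and workers as well.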

Major Major
  • 3
    I could swear that 3 days ago I was running a word2vec model with size, and today I've had to upvote your comment for how spot on it was. edit: could it be because 3 days ago I was running said model on Python 3.8 and now I'm doing it on a VM that has Python 3.6? – Dkoded May 24 '21 at 19:31
  • 4
    i am having the same issue with param `iter` – Sunil Garg May 31 '21 at 07:26
  • 2
    @SunilGarg I don't think there is an `iter` parameter for Word2vec. There is `epochs` but I'm not sure if that's what you want. – Major Major Jun 01 '21 at 18:47
  • 8
    @SunilGarg I've experienced the same issue with `iter`. Here's what the documentation says: `epochs (int, optional) – Number of iterations (epochs) over the corpus. (Formerly: iter)` – Pysnek313 Jun 03 '21 at 12:47
  • Released Mar. 31 2021 3.x -> 4 https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4 – Pysnek313 Jun 03 '21 at 13:15
5

Use vector_size instead of size:

# creating a word to vector model
model_w2v = gensim.models.Word2Vec(
            tokenize_data,
            vector_size=200)
Biman Pal
1

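__init__() is the constructor of the Word2Vec class. The error means that gensim.models.Word2Vec() does not accept a parameter named documents, so it is possible that you do not need to pass it at all. Note that a model created without a corpus stays untrained until you call build_vocab() and train() on it.

Try this instead: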

model = gensim.models.Word2Vec(
    size=150,
    window=10,
    min_count=2,
    workers=10)
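A model built this way has no vocabulary yet. As a rough sketch of the remaining steps, assuming gensim 4.x (where size is spelled vector_size) and a corpus that is already a list of token lists, it might look like the following; the function name train_in_two_steps is made up for illustration.

```python
def train_in_two_steps(sentences):
    """Create an empty Word2Vec model, then build its vocabulary and train it.

    Assumes gensim 4.x is installed and `sentences` is a list of token lists.
    """
    # Imported inside the function so that defining it does not require gensim.
    import gensim

    model = gensim.models.Word2Vec(
        vector_size=150,  # spelled `size` in gensim 3.x
        window=10,
        min_count=2,      # words seen fewer than 2 times are dropped
        workers=10)
    model.build_vocab(sentences)              # scan the corpus once to collect the vocabulary
    model.train(sentences,
                total_examples=model.corpus_count,  # set by build_vocab()
                epochs=model.epochs)
    return model
```

With min_count=2 the corpus must contain words that occur at least twice, otherwise the vocabulary ends up empty and training fails.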
vencaslac
0

It looks like the model doesn't accept the keyword argument documents on initialization. I think you could try either of these in place of your documents= argument:

corpus_file = 'userDataFile.txt'

or

sentences = # your iterable of sentences here

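Which one fits depends on the format of your data: corpus_file expects the path to a plain-text file with one sentence per line and tokens separated by whitespace, while sentences takes an iterable of token lists.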
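For the sentences route, the scraped file first has to be turned into token lists. Here is a minimal sketch, assuming the file holds one scraped post per line and that lowercasing plus whitespace splitting is good enough tokenization; both are assumptions about the data, and read_sentences is a hypothetical helper name.

```python
import os
import tempfile


def read_sentences(path):
    """Read a text file into a list of token lists, one list per non-empty line."""
    sentences = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            tokens = line.lower().split()  # naive whitespace tokenization (assumption)
            if tokens:                     # skip blank lines
                sentences.append(tokens)
    return sentences


# Small demonstration with a temporary stand-in for userDataFile.txt.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "userDataFile.txt")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("#Travel #Sunset beach\n\n#food #travel\n")
    demo_sentences = read_sentences(demo_path)

print(demo_sentences)
# With gensim 4.x installed, the result could then be passed as:
#   model = gensim.models.Word2Vec(sentences=demo_sentences, vector_size=150,
#                                  window=10, min_count=2, workers=10)
```

Passing a plain list rather than a one-shot generator matters here, because Word2Vec iterates over the corpus several times.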

Sven Harris