I'm trying to load some pre-trained vectors into a gensim Word2Vec model so that they can be retrained with new data. My understanding is that I can do the retraining with gensim.Word2Vec.train(). However, the only way I can find to load the vectors is gensim.models.KeyedVectors.load_word2vec_format('path/to/file.bin', binary=True), which creates the kind of object that is normally the wv attribute of a gensim.Word2Vec model. On its own, this object has no train() method, which is what I need to retrain the vectors.

So how do I get these vectors into an actual gensim.Word2Vec model?

Maxim
Mike S

1 Answer

Word2Vec.load is not deprecated, so you can use it to restore a full, trainable model, provided that your pre-trained model was saved with Word2Vec.save (rather than exported as vectors only in the word2vec format).

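from gensim.models import Word2Vec

# Train and save the full model (not just its vectors).
# `sentences` is an iterable of tokenized sentences (lists of strings).
# Note: in gensim 4.0+ the `size` parameter is named `vector_size`.
model = Word2Vec(size=100, window=4, min_count=5, workers=4)
model.build_vocab(sentences)
model.train(sentences, total_examples=model.corpus_count, epochs=50)
model.save('word-vectors.bin')

...

# Later in another script: load and continue training
from gensim.models import Word2Vec

model = Word2Vec.load('word-vectors.bin')
# If the new data contains words the model has not seen, call
# model.build_vocab(sentences, update=True) before training.
model.train(sentences, total_examples=model.corpus_count, epochs=50)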
Maxim