I am new to doc2vec. While trying to understand it, I wrote the code below, which uses Gensim. It works as I expect: I get a trained model and document vectors for the two documents.
However, I would like to know what the benefits of training the model for several epochs are, and how to do it in Gensim. Can we do it using the iter or alpha parameter, or do we have to train it in a separate for loop? Please let me know how I should change the following code to train the model for 20 epochs.
I would also like to know whether multiple training iterations are needed for a word2vec model as well.
# Import libraries
from gensim.models import doc2vec
from collections import namedtuple
# Load data
doc1 = ["This is a sentence", "This is another sentence"]
# Transform data
docs = []
analyzedDocument = namedtuple('AnalyzedDocument', 'words tags')
for i, text in enumerate(doc1):
    words = text.lower().split()
    tags = [i]
    docs.append(analyzedDocument(words, tags))
# Train model
model = doc2vec.Doc2Vec(docs, size = 100, window = 300, min_count = 1, workers = 4)
# Get the vectors
model.docvecs[0]
model.docvecs[1]