I am trying to generate document vectors from a list of sentences with gensim's Doc2Vec.

        x1 = 'Today I’d like to start a series of some posts concerning extreme value analysis using R.'
        x2 = 'Basically, there are several very useful packages in R which provide methods and functions for extreme value analysis. Information on different software (including all relevant R packages) for extreme value analysis can of course be found at the R Task View on Extreme Value Analysis as well as on Eric Gilleland’s website'
        x3 = 'In addition, Gilleland, Ribatet & Stephenson have published A software review for extreme value analysis back in 2012, which provides a comprehensive overview of the most important software tools related to this topic.'
        self.sentences = [x1, x2, x3]

Then I build and train the model:

        # Wrap each sentence as a tokenized, labeled document
        documents = []
        for uid, line in enumerate(self.sentences):
            documents.append(LabeledSentence(line.split(), 'LOG_' + str(uid)))

        self.model_d2v = Doc2Vec(alpha=0.025, min_alpha=0.025, workers=self.workers, size=self.size)
        self.model_d2v.build_vocab(documents)
        # Train for 20 passes, lowering the learning rate by hand after each one
        for epoch in range(20):
            self.model_d2v.train(documents)
            self.model_d2v.alpha -= 0.002
            self.model_d2v.min_alpha = self.model_d2v.alpha

This fails with the following error:

        RuntimeError: you must first build vocabulary before training the model

It is raised at the line self.model_d2v.train(documents).

I don't understand why, because I call build_vocab just before that.
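
One thing I have not ruled out, and this is only a guess on my part: as far as I know, Doc2Vec discards words that occur fewer than min_count times, and I believe the default is 5. With only three sentences, the vocabulary might end up empty even though build_vocab ran. The sketch below is plain Python, not gensim; it only mimics that pruning rule under those assumptions, and the helper name tokens_kept is made up for illustration.

```python
from collections import Counter

x1 = 'Today I’d like to start a series of some posts concerning extreme value analysis using R.'
x2 = 'Basically, there are several very useful packages in R which provide methods and functions for extreme value analysis. Information on different software (including all relevant R packages) for extreme value analysis can of course be found at the R Task View on Extreme Value Analysis as well as on Eric Gilleland’s website'
x3 = 'In addition, Gilleland, Ribatet & Stephenson have published A software review for extreme value analysis back in 2012, which provides a comprehensive overview of the most important software tools related to this topic.'
sentences = [x1, x2, x3]

# Count whitespace-separated tokens, split the same way as line.split() in my code.
counts = Counter(token for line in sentences for token in line.split())


def tokens_kept(counts, min_count):
    """Hypothetical helper: tokens that occur at least min_count times."""
    return sorted(token for token, n in counts.items() if n >= min_count)


# Assumption: 5 is gensim's default min_count.
print(tokens_kept(counts, 5))
# With a threshold of 1, every distinct token is kept.
print(len(tokens_kept(counts, 1)))
```

If this is the cause, passing min_count=1 to Doc2Vec might avoid the error, but I have not verified that.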

Could you give me some hints?

mommomonthewind