I know that after training an LDA model in gensim, we can get the topic distribution for an unseen document with:

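from gensim.models import LdaModel

lda = LdaModel(corpus, num_topics=10)  # corpus: iterable of bag-of-words documents
doc_lda = lda[doc_bow]  # topic distribution for one bag-of-words document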

But what about documents that were already used for training? Is there a way to get the topics for a document in the training corpus without treating it like a new document?

CentAu

1 Answer

No.

Information from individual documents is distilled into the model, then forgotten. No per-document information is kept (more generally: no information that would require O(#docs) memory is kept).

Radim
  • But if I want the topic distribution for a document that was used in training, so that I can do some kind of clustering, what can I do? – storen Feb 28 '17 at 23:01