I have just started learning gensim for topic modeling. When I use

lda_model = gensim.models.ldamodel.LdaModel(...)

the resulting lda_model has two methods: get_topics() and get_document_topics(). With them I can get the topic-word and document-topic distributions. But I want to try:

hdp_lda_model = gensim.models.hdpmodel.HdpModel(...)

In its result I can only find get_topics(); there is nothing like get_document_topics(), so I cannot find the relation between documents and topics. It should be available somewhere, though. I read the documentation at https://radimrehurek.com/gensim/models/hdpmodel.html but did not find anything (maybe I missed something?). So is there a method in the HDP model that works like get_document_topics() in the LDA model?

halfer
Feng Chen

1 Answer

Both models have a __getitem__ method that does what you want.

For LDA, it is actually a wrapper around get_document_topics: https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/ldamodel.py#L1503

For HDP, it wraps the inference method but does a bit more than just call it, post-processing the inferred values into (topic_id, probability) pairs: https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/hdpmodel.py#L427

So, to answer your question, you can do the following with either model:

lda_model[bow_doc]

or

hdp_lda_model[bow_doc]

Either call returns the topic distribution for bow_doc (a document in bag-of-words form) as a list of (topic_id, probability) pairs, something like:

[(5, 0.05342164806543596),
 (7, 0.04307238446604077),
 (11, 0.5281130394662548),
 (31, 0.28899472194287035),
 (60, 0.07985460856925444)]
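Once you have such a list, picking out the most relevant topics for a document is plain Python. Here is a minimal sketch that needs no gensim at all. The helper names dominant_topic and top_topics are made up for illustration and are not part of gensim, and doc_topics is just the example output above pasted in; in practice it would be whatever lda_model[bow_doc] or hdp_lda_model[bow_doc] returns.

```python
# Example output copied from above: a list of (topic_id, probability) pairs.
# In real use this would be the value of hdp_lda_model[bow_doc].
doc_topics = [
    (5, 0.05342164806543596),
    (7, 0.04307238446604077),
    (11, 0.5281130394662548),
    (31, 0.28899472194287035),
    (60, 0.07985460856925444),
]


def dominant_topic(topic_dist):
    """Return the (topic_id, probability) pair with the highest probability.

    Hypothetical helper, not a gensim function.
    """
    return max(topic_dist, key=lambda pair: pair[1])


def top_topics(topic_dist, n=3):
    """Return the n most probable (topic_id, probability) pairs, best first.

    Hypothetical helper, not a gensim function.
    """
    return sorted(topic_dist, key=lambda pair: pair[1], reverse=True)[:n]


best_id, best_prob = dominant_topic(doc_topics)
print(best_id, round(best_prob, 3))  # 11 0.528
print([topic_id for topic_id, _ in top_topics(doc_topics, n=2)])  # [11, 31]
```

The same helpers work for both models, since both return the distribution in the same (topic_id, probability) format.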
seb