According to the official documentation, there is no natural ordering between the topics in LDA.

As for the method show_topics(), the subset of num_topics <= self.num_topics topics that it returns is therefore arbitrary and may change between two LDA training runs.

But I want to find the ten most frequent topics in the corpus. Is there any other way to achieve this?

Many thanks.

2 Answers

2

As the documentation says, there is no natural ordering between topics in LDA. If you have your own criterion for ordering the topics, such as frequency of appearance, you can always retrieve the entire list of topics from your model and sort them yourself.

However, even the notion of "top ten most frequent topics" is ambiguous, and one could reasonably come up with several different definitions of frequency. Do you mean the topics that have been assigned to the largest number of word tokens? Do you mean the topics with the highest average proportion across all documents? This ambiguity is the reason gensim has no built-in way to sort topics.
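To illustrate how those two definitions can disagree, here is a minimal sketch in plain Python. It assumes you have already collected one list of (topic_id, probability) pairs per document. The function name rank_topics, the variables lda and corpus in the comments, and the toy numbers are all made up for illustration; the token-weighted variant only approximates "number of tokens assigned" by multiplying each document's topic proportions by its length.

```python
from collections import defaultdict


def rank_topics(doc_topics, doc_lengths=None, top_n=10):
    """Rank topic ids by their share of the corpus (hypothetical helper).

    doc_topics:  one list of (topic_id, probability) pairs per document.
    doc_lengths: optional token count per document. If omitted, every
                 document counts equally (mean proportion per document).
                 If given, each document is weighted by its length
                 (approximate share of word tokens).
    """
    if doc_lengths is None:
        doc_lengths = [1] * len(doc_topics)

    totals = defaultdict(float)
    for topics, length in zip(doc_topics, doc_lengths):
        for topic_id, prob in topics:
            totals[topic_id] += prob * length

    norm = sum(doc_lengths)
    if norm == 0:
        return []

    shares = [(topic_id, weight / norm) for topic_id, weight in totals.items()]
    shares.sort(key=lambda pair: pair[1], reverse=True)
    return shares[:top_n]


# With a trained gensim model (assumed here to be called `lda`) and a
# bag-of-words corpus (assumed to be called `corpus`), the inputs could be
# built roughly like this:
#   doc_topics = [lda.get_document_topics(bow, minimum_probability=0.0)
#                 for bow in corpus]
#   doc_lengths = [sum(count for _, count in bow) for bow in corpus]

# Toy data: three documents, three topics.
doc_topics = [
    [(0, 0.7), (1, 0.2), (2, 0.1)],
    [(0, 0.1), (1, 0.1), (2, 0.8)],
    [(0, 0.6), (1, 0.3), (2, 0.1)],
]
doc_lengths = [10, 100, 10]  # the second document is much longer

by_mean = rank_topics(doc_topics)                 # every document counts equally
by_tokens = rank_topics(doc_topics, doc_lengths)  # long documents count more

print([topic_id for topic_id, _ in by_mean])
print([topic_id for topic_id, _ in by_tokens])
```

On this toy data the two rankings differ: topic 0 leads when every document counts equally, while topic 2 leads once the long second document is weighted by its length. Which one counts as "most frequent" depends on the definition you pick.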

Benjamin Bray
0

In the gensim LDA documentation, the following method is listed:

top_topics(corpus=None, texts=None, dictionary=None, window_size=None, coherence='u_mass', topn=20, processes=-1)

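This could be helpful, although note that it ranks topics by their coherence score rather than by how often they occur in the corpus.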

royn