
I am using the Mallet API to extract topics from Twitter data, and the topics I have extracted so far look reasonable. However, I am having trouble estimating K, the number of topics.

For example, I varied K from 10 to 100, so I now have models with different numbers of topics for the same data. Now I would like to determine which K is best. The methods I know of are:

  1. Perplexity
  2. Empirical likelihood
  3. Marginal likelihood (Harmonic mean method)
  4. Silhouette
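To make the perplexity criterion concrete, here is a minimal sketch of the comparison I have in mind. It assumes that for each K I already have a held-out log-likelihood (natural log) and the number of tokens in the held-out documents; the class name `KSelection`, its methods, and the numbers in `main` are hypothetical placeholders, not Mallet API and not real results. Perplexity is computed as exp(-logLikelihood / tokenCount), and the K with the lowest value is taken.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Mallet: compares candidate K values
// by held-out perplexity (lower is better).
final class KSelection {

    // Perplexity = exp(-logLikelihood / tokenCount), assuming a natural-log likelihood.
    static double perplexity(double heldOutLogLikelihood, long tokenCount) {
        if (tokenCount <= 0) {
            throw new IllegalArgumentException("tokenCount must be positive");
        }
        return Math.exp(-heldOutLogLikelihood / tokenCount);
    }

    // Returns the K whose held-out perplexity is lowest.
    // All models are assumed to be scored on the same held-out documents.
    static int bestK(Map<Integer, Double> logLikelihoodByK, long tokenCount) {
        if (logLikelihoodByK.isEmpty()) {
            throw new IllegalArgumentException("no candidate K values given");
        }
        int best = -1;
        double bestPerplexity = Double.POSITIVE_INFINITY;
        for (Map.Entry<Integer, Double> e : logLikelihoodByK.entrySet()) {
            double p = perplexity(e.getValue(), tokenCount);
            if (p < bestPerplexity) {
                bestPerplexity = p;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder values only; replace with the log-likelihoods
        // measured for each trained model on held-out tweets.
        Map<Integer, Double> logLikelihoodByK = new LinkedHashMap<>();
        logLikelihoodByK.put(10, -82000.0);
        logLikelihoodByK.put(20, -79500.0);
        logLikelihoodByK.put(50, -78100.0);
        logLikelihoodByK.put(100, -78900.0);
        long heldOutTokens = 10000L;

        for (Map.Entry<Integer, Double> e : logLikelihoodByK.entrySet()) {
            System.out.println("K=" + e.getKey() + " perplexity="
                    + perplexity(e.getValue(), heldOutTokens));
        }
        System.out.println("best K = " + bestK(logLikelihoodByK, heldOutTokens));
    }
}
```

The part I am missing is how to obtain the held-out log-likelihood for each trained model in the first place.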

I found the method model.estimate(), which I can run with different values of K, but I do not see how to show which value of K is best for the model. Could anyone give me some ideas, ideally with sample code? Thanks.

Khaled
