
I am currently working with Mallet to conduct LDA topic modeling. I noticed that I can pass the alpha hyperparameter to Mallet, but the LDAMallet class does not contain any variable for the beta parameter. Can you tell me why that is? I know I can turn on hyperparameter optimization every n intervals, which will recalculate optimal values for the parameters, but even then I don't know by what criteria they are optimized.

Best, Nero

1 Answer


I'm assuming you're referring to the gensim wrapper? You can specify beta values from command-line Mallet, so there's no reason this couldn't be implemented in Python, but you're correct that it's not there now.

In practice, the default value of 0.01 is almost always close to optimal for natural language data, which is why I suspect no one has implemented it in gensim.
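Since the gensim wrapper doesn't expose beta, one workaround is to invoke Mallet's command-line interface directly, where `train-topics` does accept a `--beta` option. A minimal sketch, assuming a local Mallet installation — the paths and filenames below are hypothetical placeholders:

```python
# Sketch: build a Mallet train-topics invocation that sets beta explicitly,
# bypassing the gensim wrapper. Paths/filenames are illustrative only.

def mallet_train_command(mallet_path, input_file, num_topics,
                         beta=0.01, optimize_interval=0):
    """Build the argument list for Mallet's train-topics command."""
    cmd = [
        mallet_path, "train-topics",
        "--input", input_file,
        "--num-topics", str(num_topics),
        "--beta", str(beta),  # symmetric topic-word smoothing parameter
    ]
    if optimize_interval:
        # re-estimate hyperparameters every N sampling iterations
        cmd += ["--optimize-interval", str(optimize_interval)]
    return cmd

cmd = mallet_train_command("/opt/mallet/bin/mallet", "corpus.mallet",
                           num_topics=20, beta=0.01, optimize_interval=10)
print(cmd)
# subprocess.run(cmd, check=True)  # uncomment to actually run Mallet
```

The `--input` file would be a Mallet-format corpus (e.g. produced by `mallet import-dir` or `import-file`); the actual `subprocess` call is left commented out since it requires a Mallet installation.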

David Mimno
  • Yes, thank you for the answer. As I am doing this for scientific research, I will need some evidence that supports the claim of beta = 0.01 being optimal in that case - then I will be fine. I will look out for such. –  May 20 '20 at 09:16
  • I can say in practice I've never seen an optimized beta go much above 0.02 or below 0.005. If you can find a reference about that, please comment, but I suspect it's not the kind of result that is considered publishable by itself. – David Mimno May 20 '20 at 14:31