
I trained an LDA model with MALLET on parts of the Stack Overflow data dump, using a 70/30 split into training and test data.

However, the perplexity values are strange: they are lower for the test set than for the training set. How is this possible? I thought the model should fit the training data better.

I have already double-checked my perplexity calculations but cannot find an error. Do you have any idea what the reason could be?
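For reference, perplexity for a topic model is usually computed from the per-token log-likelihood, so a higher (less negative) LL/token corresponds to a lower perplexity:

$$\text{perplexity} = \exp\!\left(-\frac{\sum_d \log p(\mathbf{w}_d)}{\sum_d N_d}\right),$$

where $N_d$ is the number of tokens in document $d$.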

Thank you in advance!

[Screenshot: LL/token and perplexity values for the training and test sets]

Edit:

Instead of taking the LL/token values for the training set from the console output, I ran the evaluator on the training set as well. Now the values seem plausible.

[Screenshot: LL/token values recomputed with the evaluator]

phly
1 Answer


That makes sense. The LL/token number gives you the joint probability of the topic assignments and the observed words, whereas the held-out probability gives you the marginal probability of just the observed words, with the topic assignments summed out.
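A minimal sketch with MALLET's Java API may make the distinction concrete. The file names, topic count, hyperparameters, and iteration/particle counts below are placeholders, not a recommended setup: `modelLogLikelihood()` is the joint quantity behind the console's LL/token lines, while `MarginalProbEstimator` estimates the marginal (held-out style) word likelihood.

```java
import java.io.File;
import cc.mallet.topics.MarginalProbEstimator;
import cc.mallet.topics.ParallelTopicModel;
import cc.mallet.types.FeatureSequence;
import cc.mallet.types.Instance;
import cc.mallet.types.InstanceList;

public class JointVsMarginalLL {
    public static void main(String[] args) throws Exception {
        // Placeholder file names for serialized MALLET instance lists.
        InstanceList training = InstanceList.load(new File("train.mallet"));
        InstanceList test = InstanceList.load(new File("test.mallet"));

        // Placeholder settings: 100 topics, alphaSum = 50, beta = 0.01.
        ParallelTopicModel model = new ParallelTopicModel(100, 50.0, 0.01);
        model.addInstances(training);
        model.setNumIterations(1000);
        model.estimate();

        // (a) Joint log-likelihood of words AND sampled topic assignments
        //     on the training data -- the number behind the console LL/token.
        double jointLL = model.modelLogLikelihood();
        System.out.println("joint LL/token (train):    " + jointLL / countTokens(training));

        // (b) Marginal log-likelihood of the words alone, topics summed out,
        //     from the left-to-right evaluator; comparable across train/test.
        MarginalProbEstimator evaluator = model.getProbEstimator();
        double trainMarginalLL = evaluator.evaluateLeftToRight(training, 10, false, null);
        double testMarginalLL  = evaluator.evaluateLeftToRight(test, 10, false, null);
        System.out.println("marginal LL/token (train): " + trainMarginalLL / countTokens(training));
        System.out.println("marginal LL/token (test):  " + testMarginalLL / countTokens(test));
        System.out.println("test perplexity:           "
                + Math.exp(-testMarginalLL / countTokens(test)));
    }

    // Total token count of an instance list (documents stored as FeatureSequences).
    static long countTokens(InstanceList instances) {
        long tokens = 0;
        for (Instance instance : instances) {
            tokens += ((FeatureSequence) instance.getData()).getLength();
        }
        return tokens;
    }
}
```

Dividing each value by the token count of the corresponding set gives per-token numbers, and `exp(-LL/token)` gives perplexity, so training and test figures are only comparable when both come from the evaluator.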

David Mimno