
I'm trying to calculate the perplexity of some English-language texts using NLTK, to figure out how a simple n-gram model performs with fewer training samples. What I don't understand is why perplexity gets lower when I decrease the number of training samples. The number of test samples stays the same, and there doesn't seem to be a bottom: I can go as low as I want.
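For reference, here is a minimal sketch of the kind of setup I mean, using nltk.lm (the bigram order, the Laplace smoothing, and the Brown corpus split are just placeholders for illustration, not my exact code):

```python
import nltk
from nltk.corpus import brown
from nltk.lm import Laplace
from nltk.lm.preprocessing import padded_everygram_pipeline, pad_both_ends
from nltk.util import ngrams

nltk.download("brown")  # corpus used here only as a stand-in

order = 2
sents = [[w.lower() for w in s] for s in brown.sents()]
train_sents, test_sents = sents[:-500], sents[-500:]  # fixed test set

# Build the test n-grams once; only the training size varies below.
test_ngrams = [list(ngrams(pad_both_ends(s, n=order), order))
               for s in test_sents]

for n_train in (20000, 10000, 5000, 1000):
    # Train an n-gram model on a shrinking slice of the training data.
    train_data, vocab = padded_everygram_pipeline(order, train_sents[:n_train])
    lm = Laplace(order)
    lm.fit(train_data, vocab)

    # Average per-sentence perplexity on the same held-out test set.
    avg_ppl = sum(lm.perplexity(ng) for ng in test_ngrams) / len(test_ngrams)
    print(n_train, avg_ppl)
```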

I read some articles about perplexity and searched Stack Overflow, but couldn't find an explanation.
