How is the likelihood probability in sklearn MultinomialNB calculated?

Question

We can print the class probability and the feature probability (likelihood) using class_log_prior_ and feature_log_prob_. When I compare my own calculation with sklearn's MultinomialNB, the class log prior matches but the feature log prob does not. I have watched a YouTube video about this and followed the equation in the documentation at http://scikit-learn.org/stable/modules/naive_bayes.html#multinomial-naive-bayes, but my feature log prob values still differ from sklearn's. I have many features, all of them categorical, such as word, postag, nextword, nextnextword, bigram, etc.

My understanding of the equation is:

P(X|y) = (count of feature X in class y + alpha) / (total count of all features in class y + number of unique features across all classes)

So if I have the feature Word:Hello and a named-entity class OTHER, and I set alpha to 1.0, it would become:

P(Word:Hello|OTHER) = (count of Word:Hello in class OTHER + 1) / (total count of all features in class OTHER + number of unique features across all classes)

Is this correct? Or am I wrong in how I apply the equation when there is more than one feature? Has anyone worked through this calculation before, or could someone give an example, for instance in Excel?
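For comparison, here is a minimal sketch of the smoothed estimate given in the linked documentation, (N_yi + alpha) / (N_y + alpha * n), checked against feature_log_prob_. The count matrix and labels are made up for illustration, and the sketch assumes each categorical value (for example Word:Hello) has already been encoded as its own count column.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy data (made up): rows are samples, columns are feature counts.
# With categorical inputs, each column would be one encoded value,
# e.g. column 0 = "Word:Hello" (an assumption for this sketch).
X = np.array([
    [2, 1, 0],
    [1, 0, 1],
    [0, 3, 1],
    [0, 1, 2],
])
y = np.array([0, 0, 1, 1])
alpha = 1.0

clf = MultinomialNB(alpha=alpha).fit(X, y)

n_features = X.shape[1]  # n: number of columns in X
rows = []
for c in clf.classes_:
    feature_count = X[y == c].sum(axis=0)  # N_yi: summed counts per feature in class c
    total_count = feature_count.sum()      # N_y: summed counts of all features in class c
    rows.append(np.log((feature_count + alpha) / (total_count + alpha * n_features)))
manual_log_prob = np.vstack(rows)

# Compare the manual estimate with what sklearn stores.
print(np.allclose(manual_log_prob, clf.feature_log_prob_))
```

Two details are worth checking against a hand calculation. The denominator adds alpha times the number of columns in X, which equals the "number of unique features" only when alpha is 1.0. The counts are also sums of the values in each column for the class, not the number of samples in which the feature appears.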
