How to calculate the joint log-likelihood for Bernoulli Naive Bayes

Question

For a classification problem using BernoulliNB, how do I calculate the joint log-likelihood? The joint likelihood is to be calculated by the formula below, where y(d) is the array of actual outputs (not predicted values) and x(d) is the data set of features.

I read this answer and the documentation, but neither quite served my purpose. Can somebody please help?

can't you just use the attributes `.class_log_prior_` or `.feature_log_prob_` of the trained classifier? — Daneel R., Oct 17 '18 at 18:55
Hi Daniel, no, I can't use them. I have to calculate it according to this formula only. — Aman, Oct 17 '18 at 19:19

score 1 · Answer 1 · answered Oct 17 '18 at 19:43

Looking at the source code, `BernoulliNB` has a private, undocumented method `_joint_log_likelihood(X)` that computes the joint log-likelihood.

For each sample it returns log P(c) + log P(x | c) for every class c, which is somewhat consistent with what you ask.
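As an illustration of the quantity involved, here is a minimal from-scratch sketch in plain Python. It assumes the formula in the question is the usual sum, over samples d, of log P(y(d)) plus the Bernoulli log-probabilities of the features of x(d) given y(d); the formula itself is not reproduced here, so that is an assumption. The function names `fit_bernoulli_nb` and `joint_log_likelihood` are hypothetical helpers, not part of scikit-learn.

```python
import math


def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate log class priors and smoothed P(x_j = 1 | c) per class.

    Hypothetical helper: X is a list of 0/1 feature rows, y a list of labels.
    alpha must be > 0 so that no probability is exactly 0 or 1.
    """
    classes = sorted(set(y))
    n_features = len(X[0])
    log_prior, feature_prob = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        # Additive smoothing over the two outcomes of each binary feature.
        feature_prob[c] = [
            (sum(row[j] for row in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return log_prior, feature_prob


def joint_log_likelihood(X, y, log_prior, feature_prob):
    """Sum log P(x, y) over all samples, using the actual labels y."""
    total = 0.0
    for x, label in zip(X, y):
        total += log_prior[label]
        for x_j, p_j in zip(x, feature_prob[label]):
            total += math.log(p_j) if x_j else math.log(1.0 - p_j)
    return total


# Toy data, made up for illustration only.
X_toy = [[1, 0], [1, 1], [0, 0], [0, 1]]
y_toy = [1, 1, 0, 0]
priors, probs = fit_bernoulli_nb(X_toy, y_toy, alpha=1.0)
print(joint_log_likelihood(X_toy, y_toy, priors, probs))
```

The private method in scikit-learn returns the per-class terms for every sample rather than a single number, so the sum over actual labels still has to be taken afterwards.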

Daneel R.


Khashkhuu Otgontulga · Answer 2 · 2023-04-14T09:21:00.490

- The solution is to add up, for each sample, the joint log-likelihood
  entry that belongs to its actual label.
- `_joint_log_likelihood` returns one column per class, so for the
  sample at index `idx` we add `data[idx][1]` when its label y is
  positive (`True`) and `data[idx][0]` otherwise.

- The first block of code is the **training** and *learning*.
- The second block of code is the **testing** and *counting*.

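    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import BernoulliNB

    # Xs, ys, r, alphas, train_jil and test_jil are assumed to be
    # defined by the surrounding exercise code.

    # First block: train on a single data set.
    train, test, train_labels, test_labels = train_test_split(
        Xs[0], ys[0], test_size=1./3, random_state=r)
    naive = BernoulliNB(alpha=10**-7)
    model = naive.fit(train, train_labels)
    # One row per sample, one column per class.
    joint_log_train = model._joint_log_likelihood(train)
    l = [np.append(x, y) for x, y in zip(train, train_labels)]

    # Write your code below this line.
    def count(data, label):
        """Sum each sample's joint log-likelihood for its actual label."""
        x = 0
        for idx, lab in enumerate(label):
            if lab == 1:  # positive class: use column 1
                x += data[idx][1]
            else:         # negative class: use column 0
                x += data[idx][0]
        return x

    # Second block: evaluate every data set and every alpha.
    for i, (x, y) in enumerate(zip(Xs, ys)):
        train, test, train_labels, test_labels = train_test_split(
            x, y, test_size=1./3, random_state=r)
        for j, a in enumerate(alphas):
            naive = BernoulliNB(alpha=a)
            model = naive.fit(train, train_labels)
            joint_log_train = model._joint_log_likelihood(train)
            joint_log_test = model._joint_log_likelihood(test)
            train_jil[i][j] = count(joint_log_train, train_labels)
            test_jil[i][j] = count(joint_log_test, test_labels)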
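The `count` function above hard-codes two classes with column 0 for the negative label and column 1 for the positive one. A minimal sketch of the same selection for arbitrary labels follows. The helper name `sum_true_class_entries` is hypothetical, and it assumes the columns of the matrix are ordered like the `classes` argument, which with scikit-learn would be `model.classes_`.

```python
def sum_true_class_entries(jll, labels, classes):
    """Add up, per sample, the entry in the column of its actual label.

    jll:     rows of per-class joint log-likelihoods, one row per sample
    labels:  actual label of each sample
    classes: class labels in column order (assumed to match jll's columns)
    """
    column_of = {c: i for i, c in enumerate(classes)}
    return sum(row[column_of[label]] for row, label in zip(jll, labels))


# Made-up numbers for illustration; with scikit-learn the matrix would
# come from model._joint_log_likelihood(X) instead.
jll_toy = [[-1.0, -2.0], [-3.0, -0.5], [-0.25, -4.0]]
labels_toy = [0, 1, 0]
print(sum_true_class_entries(jll_toy, labels_toy, classes=[0, 1]))
```

Looking the column up by label avoids relying on the labels being exactly `True` and `False`.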

Please add more details to your answer explaining how it solves the problem — Aelius, Jul 19 '22 at 15:12
