finding log likelihood data using numpy

Question

I am trying to use numpy to compute the log likelihood for naive Bayes. The following are the probabilities of getting a 1 in each dimension when the label is +1 and -1, respectively:

positive = [0.07973422, 0.02657807]
negative = [0.04651163, 0.02491694]  # both of these have the dimension d

The following are the test data and the labels for the test data:

x = np.array([[0,1],[1,0],[1,1]]) # dimension is n*d : note that the d is same as above
y = np.array([-1,1,-1]) #dimension is n

The result that I want:

result = [-3.73983529, -2.55599409, -6.76026018]  # dimension is n

Logic: each result element corresponds to a row in x, and the value of y for that row determines whether to use the positive or the negative probabilities.

For example, row 0 is [0, 1] with label -1, so we take the negative probabilities:

-3.73983529 = log(1 - 0.04651163) + log(0.02491694)

Here we subtract from 1 because the probability of a 0 is 1 minus the probability of a 1.

I am using tight loops right now, but I want to solve this with numpy methods to make it faster.
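The loops themselves are not shown in the question. A minimal sketch of what such a loop-based baseline might look like, assuming the selection rule described above, is given here only as a reference to check vectorized versions against. The function name `log_likelihood_loop` is hypothetical and is not the asker's actual code.

```python
import math

import numpy as np

positive = [0.07973422, 0.02657807]  # P(feature = 1 | label = +1)
negative = [0.04651163, 0.02491694]  # P(feature = 1 | label = -1)

x = np.array([[0, 1], [1, 0], [1, 1]])  # shape (n, d)
y = np.array([-1, 1, -1])               # shape (n,)


def log_likelihood_loop(x, y, positive, negative):
    """Hypothetical loop baseline (an assumption, not the asker's code)."""
    result = []
    for row, label in zip(x, y):
        # The label decides which probability vector applies to this row.
        probs = positive if label == 1 else negative
        total = 0.0
        for value, p in zip(row, probs):
            # Use p when the feature is 1, and 1 - p when it is 0.
            total += math.log(p if value == 1 else 1.0 - p)
        result.append(total)
    return np.array(result)


loop_result = log_likelihood_loop(x, y, positive, negative)
print(loop_result)
```

With the data from the question, this should agree with the desired result up to rounding of the printed probabilities.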

It might help if you added the "tight loops" that you're using now to the body of your question so that we could see them - Ben Grossmann, Mar 14 '23 at 19:24

score 0 · Accepted Answer · answered Mar 14 '23 at 19:51

Expand everything to shape n x d, then use np.where to select elementwise.

import numpy as np

positive = [0.07973422, 0.02657807]
negative = [0.04651163, 0.02491694]  # both of these have the dimension d

x = np.array(
    [[0, 1], [1, 0], [1, 1]]
)  # dimension is n*d : note that the d is same as above
y = np.array([-1, 1, -1])  # dimension is n

d = len(positive)
n = len(x)

# Cast all to n x d

positive = np.array([positive]*n)
negative = np.array([negative]*n)

y = np.repeat(y, d).reshape(n, d)

# Determine whether to use pos or neg probabilities
pos_neg = np.where(y == 1, positive, negative)

# Determine whether to use prob or 1-prob
probs = np.where(x == 0, 1 - pos_neg, pos_neg)

# Take logs and then sum
log_probs = np.log(probs)

log_like = np.sum(log_probs, axis = 1)

print(log_like)
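As a possible variation on the same idea, the explicit tiling and `np.repeat` can be left to NumPy broadcasting. This is a sketch under the assumption that `positive` and `negative` are 1-D arrays of length d and `y` contains only -1 and +1. It is not part of the original answer.

```python
import numpy as np

positive = np.array([0.07973422, 0.02657807])  # shape (d,)
negative = np.array([0.04651163, 0.02491694])  # shape (d,)

x = np.array([[0, 1], [1, 0], [1, 1]])  # shape (n, d)
y = np.array([-1, 1, -1])               # shape (n,)

# y[:, None] has shape (n, 1); it broadcasts against the (d,) vectors,
# so the result has shape (n, d) without building tiled copies by hand.
pos_neg = np.where(y[:, None] == 1, positive, negative)

# x is already (n, d), so the second selection needs no reshaping.
probs = np.where(x == 1, pos_neg, 1 - pos_neg)

# Sum the log probabilities across the d features of each row.
log_like = np.log(probs).sum(axis=1)
print(log_like)
```

The only change from the code above is how the n x d shapes are obtained; the two `np.where` selections are the same.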

score 0 · Answer 2 · answered Mar 29 '23 at 18:52


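# (y + 1) // 2 maps label -1 to row 0 (negative) and +1 to row 1 (positive)
probs = np.array([negative,positive])[(y+1)//2]
# use p where x == 1 and 1 - p where x == 0, then sum the logs per row
np.log(np.where(x==0, 1 - probs, probs)).sum(1)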

array([-3.73983544, -2.55599408, -6.76026028])
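The same quantity can also be written as the Bernoulli log likelihood x * log(p) + (1 - x) * log(1 - p), summed over the features. The sketch below is an alternative formulation, not part of the answer above. It assumes every probability lies strictly between 0 and 1 (otherwise a 0 * -inf product would give nan) and that labels are exactly -1 or +1. The names `table`, `idx`, `log_p` and `log_q` are illustrative.

```python
import numpy as np

positive = np.array([0.07973422, 0.02657807])
negative = np.array([0.04651163, 0.02491694])

x = np.array([[0, 1], [1, 0], [1, 1]])  # shape (n, d)
y = np.array([-1, 1, -1])               # shape (n,)

# Row 0 holds the label -1 probabilities, row 1 the label +1 probabilities.
table = np.stack([negative, positive])  # shape (2, d)
idx = (y + 1) // 2                      # maps -1 -> 0 and +1 -> 1

# Take logs once on the small (2, d) table, then pick a row per sample.
log_p = np.log(table)[idx]     # log(p), shape (n, d)
log_q = np.log1p(-table)[idx]  # log(1 - p), shape (n, d)

# Bernoulli log likelihood per row: features equal to 1 contribute log(p),
# features equal to 0 contribute log(1 - p).
log_like = (x * log_p + (1 - x) * log_q).sum(axis=1)
print(log_like)
```

Taking the logarithms on the 2 x d table means only 2 * d values are logged regardless of n, which may matter when n is large.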

Onyambu
