
I am trying to implement the logistic regression cost function.
I tested my implementation and it works fine on several datasets.
But when I tried it on a new dataset, I realized that the second part of the equation below (term2) is always infinite. The problem is that the value passed to np.log() is 0, and np.log(0) returns -inf. This happens because sigmoid(hypothesis(x,theta)) evaluates to 1, so 1 - sigmoid(hypothesis(x,theta)) is 0.

term1 = -y*(np.log(sigmoid(hypothesis(x,theta))))
term2 = ((1-y)*(np.log(1 - sigmoid(hypothesis(x,theta)))))
infunc1 = term1 - term2 
infunc2 = (lambda_*np.sum(theta[1:]**2))/(2*m)
j = (np.sum(infunc1)/m)+infunc2
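
A minimal reproduction of the problem, assuming sigmoid is the standard logistic function (its definition is not shown above) and using a hypothetical score of 40.0 in place of hypothesis(x,theta). In float64 the sigmoid rounds to exactly 1 for scores of that size, so the argument of the log becomes 0:

```python
import numpy as np

def sigmoid(z):
    # Assumed standard logistic function; the original sigmoid is not shown.
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([0.0, 40.0])   # 40.0 is a hypothetical large value of hypothesis(x, theta)
h = sigmoid(z)
saturated = h == 1.0        # True where float64 rounds the sigmoid to exactly 1

# Silence the divide-by-zero warning that np.log(0) would otherwise raise.
with np.errstate(divide="ignore"):
    log_term = np.log(1 - h)  # log(0) evaluates to -inf for the saturated entry
```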

The first solution I can think of is to add a very small value to the argument of the log so that it is never 0. But I do not know whether this is correct (based on this question).

What should I do when the result of multiplying the features by the weights makes the argument passed to log equal to zero?

Thanks for any advice. Happy coding

aolson512

1 Answer


As others have suggested, you just need to add a very small value to the argument of the log function so that it can never be exactly 0:

epsilon = 1e-5
term1 = -y*(np.log(epsilon+sigmoid(hypothesis(x,theta))))
term2 = ((1-y)*(np.log(epsilon+1 - sigmoid(hypothesis(x,theta)))))
infunc1 = term1 - term2 
infunc2 = (lambda_*np.sum(theta[1:]**2))/(2*m)
j = (np.sum(infunc1)/m)+infunc2
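
Adding epsilon slightly biases the cost. As an alternative sketch (not the method from the answer above), the two log terms can be computed directly from the linear score with np.logaddexp, which stays finite even when the sigmoid saturates. This assumes hypothesis(x, theta) is the plain linear score x @ theta and that x already contains a leading column of ones; stable_cost is a hypothetical name:

```python
import numpy as np

def stable_cost(theta, x, y, lambda_):
    # Assumption: hypothesis(x, theta) == x @ theta, with a bias column in x.
    m = y.shape[0]
    z = x @ theta
    log_h = -np.logaddexp(0.0, -z)            # log(sigmoid(z))
    log_one_minus_h = -np.logaddexp(0.0, z)   # log(1 - sigmoid(z))
    data_term = -np.sum(y * log_h + (1 - y) * log_one_minus_h) / m
    # Regularization skips theta[0], as in the original code.
    reg_term = lambda_ * np.sum(theta[1:] ** 2) / (2 * m)
    return data_term + reg_term

# Hypothetical data: the second row gives z = 1000, where sigmoid(z) rounds to 1.
x = np.array([[1.0, 0.0], [1.0, 1000.0]])
y = np.array([0.0, 1.0])
theta = np.array([0.0, 1.0])
cost = stable_cost(theta, x, y, lambda_=0.0)
```

The identities used are log(sigmoid(z)) = -log(1 + exp(-z)) and log(1 - sigmoid(z)) = -log(1 + exp(z)), so the sigmoid itself is never evaluated and no 0 reaches the log.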