I'm trying to implement the categorical cross-entropy loss function to better understand the intuition behind it. So far my implementation looks like this:
import numpy as np

# Observations
y_true = np.array([[0, 1, 0], [0, 0, 1]])
y_pred = np.array([[0.05, 0.95, 0.05], [0.1, 0.8, 0.1]])

# Loss calculations
def categorical_loss():
    loss1 = -(0.0 * np.log(0.05) + 1.0 * np.log(0.95) + 0.0 * np.log(0.05))
    loss2 = -(0.0 * np.log(0.1) + 0.0 * np.log(0.8) + 1.0 * np.log(0.1))
    loss = (loss1 + loss2) / 2  # averaged over the 2 observations
    return loss

# Show loss
print(categorical_loss())  # 1.176939193690798
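For reference, the hand-written sums above can also be expressed in vectorized form (a sketch; the per-row sum over classes followed by a mean over rows reproduces the same value):

```python
import numpy as np

y_true = np.array([[0, 1, 0], [0, 0, 1]])
y_pred = np.array([[0.05, 0.95, 0.05], [0.1, 0.8, 0.1]])

def categorical_loss(y_true, y_pred):
    # Sum -y_true * log(y_pred) over classes (axis=1),
    # then average over the observations (rows).
    return np.mean(-np.sum(y_true * np.log(y_pred), axis=1))

print(categorical_loss(y_true, y_pred))  # 1.176939193690798
```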
However I do not understand how the function should behave to return the correct value when:

- at least one number from `y_pred` is 0 or 1, because then the `log` function returns `-inf` or 0, and how the code implementation should look in this case
- at least one number from `y_true` is 0, because multiplication by 0 always returns 0 and the value of `np.log(0.95)` will be discarded then, and how the code implementation should look in this case as well