
I made a neural network with Keras in Python and cannot really understand what the value of the loss function means.

First, some general information: I worked with the poker hand dataset, which has classes 0-9 that I encoded as one-hot vectors. I used the softmax activation in the last layer, so for each of the 10 entries of the output vector my model gives the probability that the sample belongs to that class. For example: my true label is (0,1,0,0,0,0,0,0,0,0), which means class 1 (the classes 0-9 range from no card to royal flush), and class 1 means one pair (if you know poker). From the neural net I get outputs like (0.4, 0.2, 0.1, 0.1, 0.2, 0, 0, 0, 0, 0), which means that my sample belongs to class 0 with 40 percent probability, to class 1 with 20 percent, and so on.
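For reference, a minimal sketch of that encoding (the labels here are made up; `to_categorical` is one way to produce the one-hot vectors):

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Class labels 0-9 (no card ... royal flush); example values only
y = np.array([1, 0, 5])  # e.g. class 1 = one pair

# One-hot encoding: class 1 becomes (0,1,0,0,0,0,0,0,0,0)
y_onehot = to_categorical(y, num_classes=10)
print(y_onehot[0])  # [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
```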

Alright! I also used binary cross entropy as the loss, the accuracy metric, and the RMSprop optimizer. When I use model.evaluate() from Keras, I get something like 0.16 for the loss and I do not know how to interpret this. Does it mean that on average my predictions deviate by 0.16 from the true values? So if my prediction for class 0 is 0.5, could the true value just as well be 0.66 or 0.34? Or how should I interpret it? A sketch of my setup is below; the layer sizes and data names are placeholders, not my exact model:
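```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Placeholder architecture: 10 input features (suit + rank for 5 cards),
# 10 output classes with softmax, as described above
model = Sequential([
    Dense(64, activation='relu', input_shape=(10,)),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',  # the loss in question
              metrics=['accuracy'])

# evaluate() returns [loss, accuracy] averaged over the given data,
# e.g.: loss, acc = model.evaluate(X_test, y_test_onehot)
```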

Please send help!

Eli Hektor
  • Why do you use binary cross entropy when you have a multi-class problem? – Code Pope May 04 '20 at 09:38
  • https://developers.google.com/machine-learning/crash-course/descending-into-ml/training-and-loss might be a good start to read up on. Once you understand the loss, you can then look into what loss is being used in your model. Should be MSE as well. – Jason Chia May 04 '20 at 09:38
  • Thanks Jason, I think I understand the loss; my problem is more the computation in Keras's model.evaluate()! I use binary crossentropy because I first use one-hot encoding. – Eli Hektor May 04 '20 at 09:58

1 Answer


First of all, according to your problem definition you have a multi-class problem. Thus, you should use categorical_crossentropy. Binary cross entropy is for two-class problems or for multi-label classification.
But in general the value of the loss function only has a relative meaning. First of all, you have to understand what the cross entropy means. The formula is:
$$\mathrm{CE} = -\sum_{c=1}^{M} y_{o,c}\,\log(p_{o,c})$$
where $M$ is the number of classes, $y_{o,c}$ is the binary indicator (0 or 1) of whether class label $c$ is the correct classification for observation $o$, and $p_{o,c}$ is the predicted probability that observation $o$ is of class $c$.
For binary cross entropy, $M$ is equal to 2; for categorical cross entropy, $M > 2$. In both cases, the cross entropy decreases as the predicted probability converges to the true label:
(Plot: the log loss falls toward 0 as the predicted probability of the true label approaches 1, and grows without bound as it approaches 0.)
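To make this concrete, here is a small sketch (using the probabilities from your example) that computes the categorical cross entropy for a single observation by hand:

```python
import numpy as np

y = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0])            # true label: class 1, one-hot
p = np.array([0.4, 0.2, 0.1, 0.1, 0.2, 0, 0, 0, 0, 0])  # predicted probabilities

# Categorical cross entropy: -sum_c y_c * log(p_c).
# Only the term for the correct class survives, so this is -log(p[1]).
eps = 1e-7                         # avoid log(0), as Keras does internally
ce = -np.sum(y * np.log(p + eps))
print(ce)                          # -log(0.2) ~= 1.61
```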

Now let's take your example, where you have 10 classes and your true label is (0,1,0,0,0,0,0,0,0,0). If you have a loss of 0.16, it means that
$$-\log(p_{o,1}) = 0.16 \;\Rightarrow\; p_{o,1} = e^{-0.16} \approx 0.85,$$
which means that your model has assigned a probability of about 0.85 to the correct label.
Therefore, the loss function gives you the negative log of the correct classification probability. As Keras computes the loss over whole batches, it is the average of the negative log of the correct classification probability over all the data in the specific batch. If you use the evaluate function, it is that average over all the data you are evaluating.
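Assuming categorical cross entropy, you can therefore invert the reported loss to recover the (geometric) average probability the model assigns to the correct class:

```python
import numpy as np

loss = 0.16                 # value reported by model.evaluate()
p_correct = np.exp(-loss)   # invert -log(p) = loss
print(p_correct)            # ~0.85
```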

Code Pope
  • Thank you! This is exactly what I wanted to see/hear :) I understand it now! But one more question: if I used one-hot encoding to represent my 10 classes as vectors, i.e. (1,0,0,...) for class 0, (0,1,0,0,...) for class 1 and so on, why shouldn't I use binary cross entropy? – Eli Hektor May 04 '20 at 11:21