I am implementing a CNN coupled to a multiple instance learning layer. In brief, I've got this, with C the number of categories:
[1 batch of images, 1 label] -> CNN -> Custom final layer -> [1 vector of size C]
My final layer just sums up the previous layer's output for the moment. To be clear, 1 batch of inputs only gives 1 single output. Each batch therefore corresponds to multiple instances gathered into 1 single bag, associated with 1 label.
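For reference, this is roughly the shape contract of my generator: one step yields one whole bag of instances plus a single bag-level label (the names and dimensions below are illustrative; the real generator reads images from disk):

    import numpy as np

    def bag_generator_sketch(n_instances=32, height=64, width=64, n_categories=3):
        # Stand-in for training_generator: each yield is one bag of
        # instances and one one-hot label for the whole bag.
        while True:
            bag = np.random.rand(n_instances, height, width, 3)
            label = np.zeros((1, n_categories))
            label[0, np.random.randint(n_categories)] = 1.0
            yield bag, label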
When I train my model and validate it with the same set:
    history = model.fit_generator(
        generator=training_generator,
        steps_per_epoch=training_set.batch_count,
        epochs=max_epoch,
        validation_data=training_generator,
        validation_steps=training_set.batch_count)
I get 2 different results between the training and the validation sets, even though they are the exact same set:
35/35 [==============================] - 30s 843ms/step - loss: 1.9647 - acc: 0.2857 - val_loss: 1.9403 - val_acc: 0.3714
The loss function is just the categorical cross entropy as implemented in Keras (I have 3 categories). I implemented my own loss function to get some insight into what happens. Unfortunately, I obtain another inconsistency, this time between the regular loss and my custom loss function:
35/35 [==============================] - 30s 843ms/step - loss: 1.9647 - acc: 0.2857 - bag_loss: 1.1035 - val_loss: 1.9403 - val_acc: 0.3714 - val_bag_loss: 1.0874
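My current suspicion is that the order of the reductions matters: if Keras averages the per-instance losses over the batch axis, while my bag_loss averages the vectors first and only then takes the cross entropy, the two results will generally disagree. A toy numpy check with made-up numbers (ignoring Keras's internal clipping and rescaling) illustrates the gap:

    import numpy as np

    def cce(t, p):
        # categorical cross entropy for a single probability vector
        return -np.sum(t * np.log(p))

    y_true = np.array([[1.0, 0.0, 0.0],
                       [1.0, 0.0, 0.0]])  # bag label repeated per instance
    y_pred = np.array([[0.6, 0.3, 0.1],
                       [0.2, 0.5, 0.3]])

    mean_of_losses = np.mean([cce(t, p) for t, p in zip(y_true, y_pred)])
    loss_of_mean = cce(y_true.mean(axis=0), y_pred.mean(axis=0))

    print(mean_of_losses)  # ~1.06: reduce per instance, then average
    print(loss_of_mean)    # ~0.92: average first, then reduce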
My loss function:
    import keras

    def bag_loss(y_true, y_predicted):
        # Average labels and predictions over the batch (bag) axis first,
        # then take the cross entropy of the two bag-level mean vectors.
        y_true_mean = keras.backend.mean(y_true, axis=0, keepdims=False)
        y_predicted_mean = keras.backend.mean(y_predicted, axis=0, keepdims=False)
        loss = keras.losses.categorical_crossentropy(y_true_mean, y_predicted_mean)
        return loss
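For completeness, bag_loss is attached as an extra metric at compile time, which is why it appears next to the built-in loss in the log (the optimizer choice here is incidental):

    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy', bag_loss])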
The final layer of my model (I only show the call part, for concision):
    def call(self, x):
        # kb is an alias for keras.backend
        x = kb.sum(x, axis=0, keepdims=True)  # collapse the instance (batch) axis
        x = kb.dot(x, self.kernel)
        x = kb.bias_add(x, self.bias)
        out = kb.sigmoid(x)
        return out
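To double-check that this layer really collapses a whole batch into one prediction, here is a numpy stand-in for the same three operations (the feature size of 128 is made up):

    import numpy as np

    x = np.random.rand(32, 128)            # one bag: (n_instances, features)
    kernel = np.random.rand(128, 3)        # (features, C)
    bias = np.zeros(3)

    pooled = x.sum(axis=0, keepdims=True)  # (1, 128): instance axis collapsed
    logits = pooled.dot(kernel) + bias     # (1, 3)
    out = 1.0 / (1.0 + np.exp(-logits))    # sigmoid, still (1, 3)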
After inspecting the code with TensorBoard and the TensorFlow Debugger, I found that my bag loss and the regular loss do return the same value at some point. But then, Keras performs 6 supplemental additions on the regular sigmoid loss (1 for each layer in my model). Can someone help me disentangle this ball of surprising results? I expect the regular loss, the validation loss, and my bag loss to all be the same.