
Learners,

I want to train a neural network with mini-batches using a custom loss function. Every mini-batch contains n new samples and m replay samples; the replay samples are included to avoid forgetting.

My loss function looks like this:

loss = mse(new_samples_truth, new_samples_pred) + factor * mse(replay_samples_truth, replay_samples_pred)

As you can see, the loss is a weighted sum of two MSE terms, calculated separately for the new samples and the replayed samples. That means whenever I train on a batch, I want to separate the new and replay data points and calculate a single scalar loss for the entire batch.

How can I implement this loss function in Keras and use it with train_on_batch? The train_on_batch method seems to calculate the loss with the loss function for every data point in the mini-batch separately. Since my batch contains both new and replay data points, this will not work. So how can I make Keras calculate the loss for the entire batch at once and return only one scalar? It also seems as if Keras evaluates the loss function for every data point in the batch separately and stores the losses per sample in an array, whereas I want the loss for the entire batch. Does anybody understand how Keras actually handles the loss calculation for batches?
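
To illustrate what I mean by per-sample losses, here is a minimal sketch (the shapes are just an example matching my batch size of 20):

    import tensorflow as tf

    y_true = tf.random.normal((20, 1))  # one mini-batch of 20 targets
    y_pred = tf.random.normal((20, 1))

    per_sample = tf.keras.losses.mse(y_true, y_pred)
    print(per_sample.shape)  # (20,) -> one loss value per sample, not a single scalar for the batch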

Here is my pseudocode:

    batch = pd.concat([new_samples, replay_samples])  # new_samples and replay_samples are pd.DataFrames
    # len(batch) = 20

    def my_replay_loss(factor):
        def loss(y_true, y_pred):  # y_true and y_pred come from Keras

            y_true_new_samples = y_true.head(10)
            y_pred_new_samples = y_pred.head(10)
            y_true_replay_samples = y_true.tail(10)
            y_pred_replay_samples = y_pred.tail(10)

            calc_loss = mse(y_true_new_samples, y_pred_new_samples) + factor * mse(y_true_replay_samples, y_pred_replay_samples)
            return calc_loss

        return loss



1 Answer


You can define a custom loss function similarly to what you did, but you need to build it from TensorFlow operations so it works on tensors. Here is an example:

import tensorflow as tf

def my_replay_loss(factor):
    def loss(y_true, y_pred):
        # the first 10 rows of the batch are the new samples, the last 10 the replay samples
        calc_loss = tf.math.add(tf.keras.losses.mse(y_true[:10, :], y_pred[:10, :]),
                                tf.math.multiply(factor, tf.keras.losses.mse(y_true[-10:, :], y_pred[-10:, :])))
        return calc_loss
    return loss

Then you can compile your model with:

loss=my_replay_loss(factor=tf.constant(0.5, dtype=tf.float32))
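
For completeness, a rough sketch of how this plays with train_on_batch (new_x/new_y and replay_x/replay_y are placeholder arrays for the 10 new and 10 replay samples of one mini-batch, and model is your compiled model):

import numpy as np

# the loss slices by position, so the batch must keep the order:
# new samples first, replay samples last
x_batch = np.concatenate([new_x, replay_x], axis=0)
y_batch = np.concatenate([new_y, replay_y], axis=0)

# the loss function is called once on the whole batch;
# with no extra metrics, train_on_batch returns a single scalar loss
batch_loss = model.train_on_batch(x_batch, y_batch)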

However, in your case I would suggest not building the loss function on the order of the data points in the batch. Instead, I would build a model with 2 inputs and 2 outputs (possibly sharing the model architecture) and compile it with 2 losses. Keras models support the loss_weights parameter, which lets you weight the two losses as you prefer.

loss_weights: Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model's outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.

Here is a quick example:

import tensorflow as tf

def get_compiled_model(input_shape):

    # two inputs: one for the new samples, one for the replay samples
    input1 = tf.keras.Input(input_shape)
    input2 = tf.keras.Input(input_shape)

    # the layers are shared, so both branches use the same weights
    conv_layer = tf.keras.layers.Conv2D(128, 3, activation='relu')
    flatten_layer = tf.keras.layers.Flatten()
    dense_layer = tf.keras.layers.Dense(1)  # linear output to go with the MSE losses

    x1 = conv_layer(input1)
    x1 = flatten_layer(x1)
    output1 = dense_layer(x1)

    x2 = conv_layer(input2)
    x2 = flatten_layer(x2)
    output2 = dense_layer(x2)

    model = tf.keras.Model(inputs=[input1, input2], outputs=[output1, output2])

    # total loss = 1 * mse(output1) + 0.5 * mse(output2)
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss=[tf.keras.losses.mse, tf.keras.losses.mse],
                  loss_weights=[1, 0.5])
    return model

# x_train1/y_train1 are the new samples, x_train2/y_train2 the replay samples
model = get_compiled_model(input_shape=x_train1[0].shape)
model.fit([x_train1, x_train2], [y_train1, y_train2], epochs=5, batch_size=10)
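
With this two-input setup you can still use train_on_batch; a rough sketch, where x_new/y_new and x_replay/y_replay are placeholders for the new and replay samples of one mini-batch:

# new and replay samples go through separate inputs, so no ordering trick is needed
results = model.train_on_batch([x_new, x_replay], [y_new, y_replay])
# results = [total_loss, loss_output1, loss_output2], with
# total_loss = 1 * loss_output1 + 0.5 * loss_output2 (from loss_weights=[1, 0.5])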