Keras: Custom loss function with training data not directly related to model

Question

I am trying to convert my CNN, written with TensorFlow layers, to the Keras API in TensorFlow (the Keras API provided by TF 1.x), and I am having trouble writing a custom loss function to train the model.

According to this guide, a custom loss function is expected to take the arguments (y_true, y_pred): https://www.tensorflow.org/guide/keras/train_and_evaluate#custom_losses

def basic_loss_function(y_true, y_pred):
    return ...

However, in every example I have seen, y_true is somehow directly related to the model (in the simple case, it has the same form as the output of the network). In my problem, this is not the case. How do I implement this if my loss function depends on some training data that is unrelated to the tensors of the model?

To be concrete, here is my problem:

I am trying to learn an image embedding trained on pairs of images. My training data includes image pairs and annotations of matching points between the image pairs (image coordinates). The input feature is only the image pairs, and the network is trained in a siamese configuration.

I am able to implement this with TensorFlow layers and train it successfully with TensorFlow estimators. My current implementation builds a tf Dataset from a large database of tf Records, where the features are a dictionary containing the images and arrays of matching points. Before, I could easily feed these arrays of image coordinates to the loss function, but with Keras it is unclear how to do so.

https://keras.io/examples/variational_autoencoder/ Perhaps this example can provide some insight as to how to write a loss when (y_true, y_pred) pattern breaks down. Looks like I should try `model.add_loss` with a custom loss function, but it is still unclear to me how to feed the data, especially considering my data doesn't fit into memory. — izak, Feb 04 '20 at 10:24
If you show how you train your model, what loss you use, etc, it would be a lot easier. I can't understand it just by text. — Daniel Möller, Feb 06 '20 at 19:50
Mainly we need to know what your model outputs, what is your loss function, what its inputs are, and how you use the points array. — Daniel Möller, Feb 06 '20 at 19:53
I think I got the idea now and updated the hack below to include external data that is not "sample to sample". — Daniel Möller, Feb 12 '20 at 13:15

Daniel Möller · Answer 1 · 2020-02-12T13:14:47.093

There is a hack I often use: calculate the loss inside the model, by means of Lambda layers. (This helps, for instance, when the loss is independent of the true data and the model doesn't really have an output to compare against.)

In a functional API model:

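from tensorflow.keras import backend as K
from tensorflow.keras.layers import Lambda
from tensorflow.keras.models import Model

def loss_calc(x):
    # arbitrary inputs: you choose them, in the order
    # you passed them to the Lambda layer
    loss_input_1, loss_input_2 = x

    # here you use some external data that doesn't relate to the samples
    # (external_numpy_data is a NumPy array you define elsewhere)
    externalData = K.constant(external_numpy_data)

    # calculate the loss from the inputs and the external data
    loss = ...  # placeholder: your own loss expression goes here
    return loss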

Apply it to the outputs of the model itself (the tensor(s) that are used in your loss):

loss = Lambda(loss_calc)([model_output_1, model_output_2])

Create the model outputting the loss instead of the outputs:

model = Model(inputs, loss)

Create a dummy keras loss function for compilation:

def dummy_loss(y_true, y_pred):
    return y_pred #where y_pred is the loss itself, the output of the model above

model.compile(loss=dummy_loss, ...)

For training, pass any dummy array whose first dimension matches the number of samples; its values will be ignored:

model.fit(your_inputs, np.zeros((number_of_samples,)), ...)
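
Putting the steps above together, here is a minimal sketch of the hack. It is not the asker's real model: the 16-feature inputs, the shared Dense layer, the toy weighted-distance loss and the values of external_numpy_data are all assumptions made up for illustration, and tf.constant is used in place of K.constant.

```python
import numpy as np
import tensorflow as tf

# Hypothetical external data that is not tied to individual samples
external_numpy_data = np.array([1.0, 2.0, 3.0, 4.0], dtype="float32")

def loss_calc(x):
    emb_a, emb_b = x  # the two tensors handed to the Lambda layer
    external = tf.constant(external_numpy_data)
    # Toy loss (assumption): squared embedding distance, weighted per dimension
    return tf.reduce_sum(external * tf.square(emb_a - emb_b),
                         axis=-1, keepdims=True)

# Two inputs passed through one shared layer, i.e. a tiny siamese setup
inp_a = tf.keras.Input(shape=(16,))
inp_b = tf.keras.Input(shape=(16,))
shared = tf.keras.layers.Dense(4)
loss_out = tf.keras.layers.Lambda(loss_calc)([shared(inp_a), shared(inp_b)])

# The model outputs the loss itself, one value per sample
model = tf.keras.Model([inp_a, inp_b], loss_out)

def dummy_loss(y_true, y_pred):
    return y_pred  # y_true is ignored; y_pred already is the loss

model.compile(optimizer="adam", loss=dummy_loss)

number_of_samples = 32
x_a = np.random.rand(number_of_samples, 16).astype("float32")
x_b = np.random.rand(number_of_samples, 16).astype("float32")
dummy_y = np.zeros((number_of_samples, 1), dtype="float32")  # never used

history = model.fit([x_a, x_b], dummy_y, epochs=2, batch_size=8, verbose=0)
```

The toy loss has a trivial minimum (identical embeddings), so it only demonstrates the wiring, not a useful training objective.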

Another way of doing it is to use a custom training loop.

This is much more work, though.

Although you're using TF1, you can still turn eager execution on at the very beginning of your code (tf.enable_eager_execution()) and then work the way it's done in TF2.

Follow the tutorial for custom training loops: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough

There, you calculate the gradients yourself, of any result with respect to whatever variables you want. This means you don't need to follow the Keras training conventions.
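
A rough sketch of such a loop, assuming eager execution is on (the default in TF2; in TF1 it has to be enabled first, as mentioned above). The random arrays, the single Dense layer standing in for the embedding network, and pair_loss are hypothetical placeholders. The point is only that the extra per-pair data travels through the dataset and goes straight into the loss:

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins (assumptions): 32 pairs of 16-feature "images" plus a
# 4-value annotation per pair that only the loss function uses.
rng = np.random.RandomState(0)
images_a = rng.rand(32, 16).astype("float32")
images_b = rng.rand(32, 16).astype("float32")
annotations = rng.rand(32, 4).astype("float32")

dataset = tf.data.Dataset.from_tensor_slices(
    (images_a, images_b, annotations)).batch(8)

embed = tf.keras.layers.Dense(4)  # shared embedding, used for both images
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

def pair_loss(emb_a, emb_b, ann):
    # Hypothetical loss: annotation-weighted squared distance, batch mean
    per_pair = tf.reduce_sum(ann * tf.square(emb_a - emb_b), axis=-1)
    return tf.reduce_mean(per_pair)

epoch_losses = []
for epoch in range(3):
    batch_losses = []
    for batch_a, batch_b, batch_ann in dataset:
        with tf.GradientTape() as tape:
            loss = pair_loss(embed(batch_a), embed(batch_b), batch_ann)
        # Gradients of the loss with respect to the layer's weights
        grads = tape.gradient(loss, embed.trainable_variables)
        optimizer.apply_gradients(zip(grads, embed.trainable_variables))
        batch_losses.append(float(loss))
    epoch_losses.append(float(np.mean(batch_losses)))
```

Because the dataset is iterated batch by batch, nothing here requires the whole training set to be in memory.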

Finally, you can use the approach you suggested, model.add_loss. In this case, you calculate the loss exactly the same way I did in the first approach above, and pass this loss tensor to add_loss.

You can probably compile a model with loss=None then (not sure), because you're going to use other losses, not the standard one.

In this case, your model's output will probably be None too, and you should fit with y=None.
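
As a sketch of the add_loss idea, here is one variant that registers the loss from inside a small custom layer with self.add_loss, rather than calling model.add_loss on the finished model. The PairLoss layer, the shapes and the toy loss are assumptions for illustration; the per-pair annotations enter as a third model input, and the model is compiled without a loss and fitted without targets.

```python
import numpy as np
import tensorflow as tf

class PairLoss(tf.keras.layers.Layer):
    """Hypothetical layer that registers the training loss via add_loss."""

    def call(self, inputs):
        emb_a, emb_b, ann = inputs
        # Toy loss (assumption): annotation-weighted squared distance
        per_pair = tf.reduce_sum(ann * tf.square(emb_a - emb_b),
                                 axis=-1, keepdims=True)
        self.add_loss(tf.reduce_mean(per_pair))
        return per_pair

inp_a = tf.keras.Input(shape=(16,))
inp_b = tf.keras.Input(shape=(16,))
inp_ann = tf.keras.Input(shape=(4,))  # loss-only data, fed as an input
shared = tf.keras.layers.Dense(4)     # shared weights for both branches
out = PairLoss()([shared(inp_a), shared(inp_b), inp_ann])

model = tf.keras.Model([inp_a, inp_b, inp_ann], out)
model.compile(optimizer="adam")       # no loss argument: add_loss supplies it

x_a = np.random.rand(32, 16).astype("float32")
x_b = np.random.rand(32, 16).astype("float32")
ann = np.random.rand(32, 4).astype("float32")

history = model.fit([x_a, x_b, ann], epochs=2, batch_size=8, verbose=0)
```

Feeding the annotations as an input keeps them aligned with their samples through batching and shuffling.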

Could you give some advice on [this similar question](https://stackoverflow.com/questions/67415771/custom-adaptive-loss-function-with-additional-dynamic-argument-in-keras) too? I can not conclude from your answer how I should address my problem. Many thanks. — Basilique, May 06 '21 at 15:13
