Keras custom loss function using other arguments than y_pred and y_true

Question

I have been considering a perceptron-like neural network to solve my problem. I have a dataset that, for the sake of simplicity, looks like this:

id entryWoodLength entryWoodThickness cuttingToolPos1 cuttingToolPos2 exitWoodLength exitWoodThickness
1        5.5              1.6               2.1             2.2             4.2            1.6
2        5.7              1.5               2.2             2.6             4.2            1.5
3        6.5              1.8               2.6             2.7             4.3            1.6
4        5.9              1.7               2.4             2.9             4.2            1.5
5        5.8              1.5               2.2             2.6             4.1            1.5

I thought of trying a feedforward neural network where the input would be the wood dimensions (entryWoodLength and entryWoodThickness) and the output would be the positions of the cutting tools (cuttingToolPos1 and cuttingToolPos2). We already know what the ideal dimensions of the exit wood should be (say, 4.2 for length and 1.5 for thickness). So we would technically want our network to optimize itself on the real values of the wood (exitWoodLength and exitWoodThickness). That means using the MSE of exitWoodLength and exitWoodThickness against the reference values of 4.2 and 1.5, something like this:

mean_squared_error(exitWoodLength, 4.2) + mean_squared_error(exitWoodThickness, 1.5)

However, Keras only allows custom loss functions that take the y_true and y_pred arguments, which in our case would be cuttingToolPos1 and cuttingToolPos2, not the values we want for the loss function. I was thinking of using a closure and simply ignoring the y_true and y_pred arguments, something like:

def custom_loss(exitWoodLength, exitWoodThickness):

    # Keras calls the inner function as loss(y_true, y_pred); both are ignored here
    def loss(y_true, y_pred):
        return mean_squared_error(exitWoodLength, 4.2) + mean_squared_error(exitWoodThickness, 1.5)

    return loss

But I am worried about the indexes and whether this is feasible at all.

Has anyone experienced something similar? Am I on the right path, or is using a neural network for this the wrong approach entirely?

score 0 · Answer 1 · edited Jun 15 '20 at 01:32

To my understanding, you will not be able to train the neural net with your current loss function. Its inputs are not outputs of the neural net, so no gradient can flow back to the weights. Make sure that y_true is your target variable and y_pred is the prediction (output) of the neural net.

Correct me if I'm wrong: based on some entryWood lengths/thicknesses, you want to predict the cutting positions that obtain the best results (exitWood lengths/thicknesses)? So in 'real life' you would begin with only the entryWood values?

This option is probably not satisfactory, but a first step could be to use entryWood and cuttingPos as input, and exitWood as the target variable. That way the neural net could learn to predict the exitWood for a given entryWood/cuttingPos. Then, as a second step, [based on some entryWood values] you could search for the cuttingPos that gives an exitWood of 4.2/1.5. At this moment, I'm not quite sure how you could [elegantly] make this into a single (end-to-end) model.

Example:

import pandas as pd
import numpy as np
import tensorflow as tf # (TensorFlow version 2.2)

# define dataset
df = pd.DataFrame(
    {'entryWoodLength':    [5.5, 5.7, 6.5, 5.9, 5.8],
     'entryWoodThickness': [1.6, 1.5, 1.8, 1.7, 1.5],
     'cuttingToolPos1':    [2.1, 2.2, 2.6, 2.4, 2.2],
     'cuttingToolPos2':    [2.2, 2.6, 2.7, 2.9, 2.6],
     'exitWoodLength':     [4.2, 4.2, 4.3, 4.2, 4.1],
     'exitWoodThickness':  [1.6, 1.5, 1.6, 1.5, 1.5]}
)

# STEP 1
# create and compile model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', dtype='float32'), # hidden layer
    tf.keras.layers.Dense(2, activation='linear', dtype='float32'), # output layer
])
# mse = (exitWood_pred - exitWood_true)^2
model.compile(loss='mse', optimizer='sgd')

X = df.iloc[:, :4] # entryWoodLength, entryWoodThickness, cuttingToolPos1, cuttingToolPos2
y = df.iloc[:, 4:] # exitWoodLength, exitWoodThickness

# fit to the data, find a relationship between X and y
history = model.fit(X, y, epochs=100, verbose=0)


# STEP 2
# dummy example 
entry_wood_dimensions = [6.0, 1.8]
# let's try some "random" tool position (this has to be searched for)
cut_tool_position = [2.1, 2.7]
# model input: one row with four features, shape (1, 4)
x = np.array([entry_wood_dimensions + cut_tool_position])
# predict based on entry_wood_dimensions and [manually selected] cut_tool_position
exitWood_pred = model.predict(x)

print("exitWood pred =", np.squeeze(exitWood_pred))

>> exitWood pred = [4.078477  1.4972893]

Remember that [especially in this dummy case], the model (the neural net) is describing the relationship between input and output very poorly (so I wouldn't trust the prediction at this point). But with more data, this model will (likely) get better. Also remember that I don't know the topic/data you're working on/with, so these are just some ideas/examples that could possibly help you get started/guide you.
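The search mentioned in STEP 2 could be done with a simple grid search. The following is only a sketch under assumptions: `search_tool_positions`, `toy_predict`, and the grid ranges are made up for illustration, and `toy_predict` is a hypothetical linear stand-in for the trained model so that the snippet runs without TensorFlow.

```python
import numpy as np

TARGET = np.array([4.2, 1.5])  # desired exitWoodLength, exitWoodThickness


def search_tool_positions(predict_fn, entry_wood, pos1_grid, pos2_grid, target=TARGET):
    """Return the grid point (pos1, pos2) whose predicted exit dimensions are closest to target."""
    p1, p2 = np.meshgrid(pos1_grid, pos2_grid, indexing="ij")
    candidates = np.column_stack([p1.ravel(), p2.ravel()])        # shape (n, 2)
    entry = np.tile(np.asarray(entry_wood, dtype=float), (len(candidates), 1))
    X = np.hstack([entry, candidates])                            # shape (n, 4), same column order as in STEP 1
    preds = np.asarray(predict_fn(X))                             # shape (n, 2)
    errors = np.sum((preds - target) ** 2, axis=1)                # squared distance to the target
    best = int(np.argmin(errors))
    return candidates[best], preds[best], float(errors[best])


def toy_predict(X):
    """Hypothetical stand-in for model.predict; the coefficients are invented."""
    length = X[:, 0] - 0.5 * X[:, 2] - 0.2 * X[:, 3]
    thickness = X[:, 1] - 0.1 * X[:, 3]
    return np.column_stack([length, thickness])


pos1_grid = np.linspace(2.0, 3.0, 11)
pos2_grid = np.linspace(2.0, 3.0, 11)
best_pos, best_pred, best_err = search_tool_positions(
    toy_predict, [6.0, 1.8], pos1_grid, pos2_grid
)
print("best tool positions:", best_pos, "predicted exitWood:", best_pred)
```

With the real model, `predict_fn` would presumably be `model.predict`, since it accepts an array of shape (n, 4). The result is only as trustworthy as the model's predictions, so the caveat above applies here as well.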

score 0 · Answer 2 · answered Jun 03 '20 at 13:59

From your problem statement, it seems like you need to first train your neural network and find the optimized input variables. However, from your code it looks like you are having trouble defining a custom loss. Although I do not understand your problem statement clearly, you can do this to make a custom loss function (extended from akensert's answer):

from tensorflow.keras import backend as K

# custom loss: y_true is ignored; each output column is compared to a fixed target
def custom_error(y_true, y_pred):
    return K.mean(K.square(y_pred[:, 0] - 4.2)) + K.mean(K.square(y_pred[:, 1] - 1.5))

model.compile(loss=custom_error, optimizer='sgd')

history = model.fit(X, y, epochs=100, verbose=1)
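To pass the target values as arguments instead of hard-coding them, the same loss can be wrapped in a closure. The sketch below is written with NumPy so that it runs on its own; `make_target_loss` is a hypothetical helper name, and in Keras the inner function would use tensor ops such as `K.mean` and `K.square` instead. It is evaluated on the five exitWood rows from the question, where each column contributes about 0.004 for a total of about 0.008.

```python
import numpy as np


def make_target_loss(target_length, target_thickness):
    """Build a loss(y_true, y_pred) that compares each output column to a fixed target."""
    def loss(y_true, y_pred):
        # y_true is accepted to match the Keras signature but is not used
        y_pred = np.asarray(y_pred, dtype=float)
        length_term = np.mean(np.square(y_pred[:, 0] - target_length))
        thickness_term = np.mean(np.square(y_pred[:, 1] - target_thickness))
        return float(length_term + thickness_term)
    return loss


# exitWoodLength, exitWoodThickness from the question's table
exit_wood = np.array([
    [4.2, 1.6],
    [4.2, 1.5],
    [4.3, 1.6],
    [4.2, 1.5],
    [4.1, 1.5],
])

loss_fn = make_target_loss(4.2, 1.5)
value = loss_fn(None, exit_wood)
print("loss on the observed exit dimensions:", value)
```

Note that a loss of this form only makes sense when the model's outputs are the exit dimensions; it cannot be applied directly to a model that outputs tool positions.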

score 0 · Answer 3 · answered Jun 03 '20 at 19:01

From my understanding, you want your machine to cut the entry block close to the required dimensions (the exit block), and for that you are trying to predict the required tool positions, or the other way around (it is not clear from your question). You can model this problem in two ways.

Option 1: Predicting the optimal tool position to produce the output block with the required dimension for an input block with a specific dimension. Input: Entry block (thickness and length) and Exit block (thickness and length); Output: optimal tool positions.

Option 2: Predict the output block size for the given input block dimensions and tool positions. Input: Entry block (thickness and length) and tool positions; Output: Exit block (thickness and length).

Both are valid ways to model the system. Which one you choose depends on the application, i.e., on what you want to predict: the dimensions of the exit block, or the optimal tool positions for a given input and desired output.

Both of these options can be modeled easily with your MSE loss function.
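As a rough illustration of how the two options arrange the same table, here is a NumPy sketch using the five rows from the question. The variable names are made up for this example, and the query row at the end assumes the desired exit dimensions of 4.2 and 1.5 mentioned in the question.

```python
import numpy as np

# Columns: entryWoodLength, entryWoodThickness, cuttingToolPos1,
#          cuttingToolPos2, exitWoodLength, exitWoodThickness
data = np.array([
    [5.5, 1.6, 2.1, 2.2, 4.2, 1.6],
    [5.7, 1.5, 2.2, 2.6, 4.2, 1.5],
    [6.5, 1.8, 2.6, 2.7, 4.3, 1.6],
    [5.9, 1.7, 2.4, 2.9, 4.2, 1.5],
    [5.8, 1.5, 2.2, 2.6, 4.1, 1.5],
])
entry, tool, exit_ = data[:, 0:2], data[:, 2:4], data[:, 4:6]

# Option 1: (entry block, exit block) -> tool positions
X_opt1 = np.hstack([entry, exit_])
y_opt1 = tool

# Option 2: (entry block, tool positions) -> exit block
X_opt2 = np.hstack([entry, tool])
y_opt2 = exit_

# With Option 1, a query at prediction time combines a new entry block
# with the desired exit dimensions (4.2 and 1.5 here).
query_opt1 = np.array([[6.0, 1.8, 4.2, 1.5]])

print(X_opt1.shape, y_opt1.shape, X_opt2.shape, y_opt2.shape)
```

In both cases y is a two-column regression target, which is why a plain MSE loss is enough.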
