To calculate the second derivative of a neural network's output with respect to its input using a Keras model, I implemented the code from these two posts:

However, for both techniques, the second derivative is always equal to 0. Here is the problem reproduced on a simple regression of the quadratic function:

import numpy as np
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras import layers, optimizers


x = np.linspace(0, 1, 1000)
y = x**2

def model_regression():
    model = tf.keras.Sequential([
        layers.Dense(1024, input_dim=1, activation="relu"),
        layers.Dense(1, activation="linear")])
    model.compile(optimizer=optimizers.Adam(lr=0.001),
                  loss='mean_squared_error')
    return model

my_model = model_regression()
my_model.fit(x, y, epochs=10, batch_size=20)

y_pred = my_model.predict(x)

#### Technique 1: nested K.gradients calls
first_input = K.gradients(my_model.output, my_model.input)[0]   # dy/dx
second_input = K.gradients(first_input, my_model.input)[0]      # d2y/dx2
iterate_first = K.function([my_model.input], [first_input])
iterate_second = K.function([my_model.input], [second_input])

#### Technique 2: gradients wrapped in a Lambda layer (gives the same result)
# def grad(y, x):
#     return layers.Lambda(lambda z: K.gradients(z[0], z[1]), output_shape=[1])([y, x])
# derivative1 = grad(my_model.output, my_model.input)
# derivative2 = grad(derivative1, my_model.input)
# iterate_first = K.function([my_model.input], [derivative1])
# iterate_second = K.function([my_model.input], [derivative2])

first_derivative, second_derivative = [], []

# Evaluate both derivatives one sample at a time
for i in range(x.shape[0]):
    first_derivative.append(iterate_first([np.array([[x[i]]])])[0][0][0])
    second_derivative.append(iterate_second([np.array([[x[i]]])])[0][0][0])
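
Side note: the per-point loop is only for clarity. Since K.function accepts a whole batch, a sketch that evaluates every point in one call (reusing the same handles as above) would be:

x_batch = x.reshape(-1, 1)                   # shape (1000, 1)
first_all = iterate_first([x_batch])[0]      # dy/dx at every x
second_all = iterate_second([x_batch])[0]    # d2y/dx2 at every x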

Here is the graphical result. As you can see, the second derivative is always equal to 0, while it should be approximately equal to 2. How can I correctly compute the second derivative of my neural network in Keras?

[Plot of the fit and its derivatives: the second-derivative curve is flat at 0.]

  • I thought it was possible to compute, since TensorFlow has the tf.hessians(ys, xs) function for computing second derivatives. Maybe I am wrong; I will compute my second derivative numerically. Thanks. – laurent Bimont Apr 07 '20 at 09:48
  • Interesting.... I must say that I'm "not sure" of what I said, of course. – Daniel Möller Apr 07 '20 at 12:43
  • Out of curiosity, what happens if you take the gradients with relation to the model's weights? `model.get_layer("name_of_first_dense").kernel`? – Daniel Möller Apr 07 '20 at 12:50
  • By replacing my ReLU layers with sigmoid ones, I got values for my second derivative fluctuating around 2. So apparently the second derivative is defined somewhere, but it does not work well for ReLU layers. – laurent Bimont Apr 07 '20 at 12:57
  • Whoa... if you think about it, the second derivative of a ReLU is actually zero. (Deleted my first comment, which was clearly wrong.) – Daniel Möller Apr 07 '20 at 13:04
  • I am afraid that the second derivative of ReLU is 0. As a result, the overall second derivative of the neural network is 0 too (see the sketch below). – laurent Bimont Apr 07 '20 at 13:09
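
To illustrate the conclusion of the comments, here is a minimal sketch (assuming TensorFlow 2.x with eager execution): the same regression with a smooth activation (tanh) in place of relu, differentiated twice with nested tf.GradientTape, yields a second derivative close to 2.

import numpy as np
import tensorflow as tf

# Same quadratic regression, but with tanh so the second derivative
# is not identically zero.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="tanh", input_shape=(1,)),
    tf.keras.layers.Dense(1, activation="linear")])
model.compile(optimizer="adam", loss="mean_squared_error")

x = np.linspace(0, 1, 1000).reshape(-1, 1)
model.fit(x, x**2, epochs=10, batch_size=20, verbose=0)

# Differentiate the model twice with respect to its input.
xt = tf.constant(x, dtype=tf.float32)
with tf.GradientTape() as tape2:
    tape2.watch(xt)
    with tf.GradientTape() as tape1:
        tape1.watch(xt)
        y = model(xt)
    dy_dx = tape1.gradient(y, xt)     # approaches 2x after training
d2y_dx2 = tape2.gradient(dy_dx, xt)   # fluctuates around 2 with tanh

With relu in place of tanh, d2y_dx2 comes back as all zeros, consistent with a ReLU network being piecewise linear.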
