
I am trying to calculate the second derivative of a simple vector function of a scalar variable f(x) = (x,x^2,x^3) using TF 2.3 with tf.GradientTape.

import tensorflow as tf

def f_ab(x):
    return x, x**2, x**3

in1 = tf.cast(tf.convert_to_tensor(tf.Variable([-1, 3, 0, 6]))[:, None], tf.float64)
with tf.GradientTape(persistent=True) as tape2:
    tape2.watch(in1)
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(in1)
        f1, f2, f3 = f_ab(in1)
    df1 = tape.gradient(f1, in1)
    df2 = tape.gradient(f2, in1)
    df3 = tape.gradient(f3, in1)

d2f1_dx2 = tape2.gradient(df1, in1)
d2f2_dx2 = tape2.gradient(df2, in1)
d2f3_dx2 = tape2.gradient(df3, in1)

For some reason, only the last two derivatives are correct, while the first, d2f1_dx2, turns out to be None.

When I changed f_ab to

def f_ab(x):
    return x**1, x**2, x**3

I got d2f1_dx2 = <tf.Tensor: shape=(1, 4), dtype=float64, numpy=array([[-0., 0., nan, 0.]])> which is "almost" the correct result.

Only when I changed f_ab to

def f_ab(x):
    return tf.math.log(tf.math.exp(x)), x**2, x**3

I got the correct result: d2f1_dx2 = <tf.Tensor: shape=(1, 4), dtype=float64, numpy=array([[0., 0., 0., 0.]])>

Has anyone encountered this problem before? Why does the straightforward way give None?

Benny K
  • This does indeed look like weird behaviour to me. Consider filing an [issue on github](https://github.com/tensorflow/tensorflow/issues). – Lescurel Jan 12 '21 at 10:56

1 Answer


I think it's because the first derivative of x with respect to x is a constant (a tensor of ones). So by the time you compute the second derivative, df1 and in1 are unrelated to each other, since df1 is a constant.

In TensorFlow, the .gradient() method defaults to None when there is no differentiable path in the graph between the two variables:

gradient(
    target, sources, output_gradients=None,
    unconnected_gradients=tf.UnconnectedGradients.NONE
)

See the TensorFlow documentation for tf.GradientTape.gradient.

You can pass unconnected_gradients=tf.UnconnectedGradients.ZERO to get 0 instead of None, and you should get the expected result for the derivative of a constant.
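A minimal sketch of that fix (assuming TF 2.x; the variable values are just for illustration):

```python
import tensorflow as tf

# f1(x) = x, so df1 = d(f1)/dx is a constant tensor of ones with no
# differentiable path back to in1 on the outer tape.
in1 = tf.Variable([[-1.0], [3.0], [0.0], [6.0]], dtype=tf.float64)

with tf.GradientTape() as tape2:
    with tf.GradientTape() as tape:
        f1 = in1  # identity: f1(x) = x
    df1 = tape.gradient(f1, in1)  # all ones (a constant)

# The default (tf.UnconnectedGradients.NONE) would return None here;
# ZERO returns a tensor of zeros of the right shape instead.
d2f1 = tape2.gradient(df1, in1,
                      unconnected_gradients=tf.UnconnectedGradients.ZERO)
print(d2f1)  # tensor of zeros, shape (4, 1)
```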

Yoan B. M.Sc
  • No, the derivative of a constant should be zero – Benny K Jan 12 '21 at 18:37
  • @BennyK, I know, not sure what you are referring to. – Yoan B. M.Sc Jan 12 '21 at 18:45
  • @BennyK, my explanation still stands; I don't understand how this invalidates my answer. – Yoan B. M.Sc Jan 12 '21 at 18:56
  • The derivative of a constant vector should be a vector of zeros, not None. I did not use gradient but GradientTape – Benny K Jan 12 '21 at 22:02
  • @BennyK, you're doing `df1 = tape.gradient(f1, in1)`, so you are using the gradient method. What's happening is that the gradient method defaults to `None` if there is no link between the two variables you are trying to differentiate. I agree that it should be `0`, but the default in `TF` is `None`. You can change this behavior as I explain in my answer. – Yoan B. M.Sc Jan 13 '21 at 13:08
  • you can try for yourself that the first derivative of a constant vector is just a vector of zeros – Benny K Jan 13 '21 at 13:54