I am trying to use a neural network architecture called Physics-Informed Neural Networks (PINNs), whose aim is to solve a PDE or ODE with a neural network. This requires taking the derivative of the network output with respect to the input data, as in the `u_prima` function below, and then with respect to the model parameters, but I get an error. My model definition is:

import tensorflow as tf
from tensorflow import keras as ks

# Create the model
inputs = ks.Input(shape=(1,), dtype=tf.float32, name='entrada')
x1 = ks.layers.Dense(64, activation="relu", name='oculta1')(inputs)
x2 = ks.layers.Dense(64, activation="relu", name='oculta2')(x1)
outputs = ks.layers.Dense(1, name='salida')(x2)
model = ks.Model(inputs=inputs, outputs=outputs)
model.summary()

# Instantiate an optimizer
optimizador = ks.optimizers.Adam()

def u_prima(model, t):
    # Inner tape: differentiate the network output u with respect to the input t.
    # t is a plain tensor, not a Variable, so it must be watched explicitly.
    with tf.GradientTape() as tp:
        tp.watch(t)
        u = model(t, training=True)
    # u_t = 1 + t * du/dt
    u_t = 1 + t * tp.gradient(u, t)
    return u_t

This derivative goes into the loss function used to train the model parameters. I tried to implement it as follows:

def loss(model, u_prima, f_x, t):
    u_t = u_prima(model, t)
    # Mean squared residual between u_t and the target f_x(t)
    return tf.reduce_mean(tf.square(u_t - f_x(t)))

loss(model, u_prima, f, t)
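
(`f` and `t` are not defined in the question. A minimal, hypothetical setup that would make the call above runnable might look like the following; the collocation points and the target function are assumptions, not the original author's choices.)

# Hypothetical collocation points: a column of inputs with shape (N, 1)
t = tf.reshape(tf.linspace(0.0, 1.0, 100), (-1, 1))

# Hypothetical right-hand side f(t); the actual target is not shown in the question
f = lambda t: tf.cos(t)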

Finally, I run my training loop and get the error:

epochs = 10
for epoch in range(epochs + 1):
    # Outer tape: gradients of the loss with respect to the model parameters
    with tf.GradientTape() as tp:
        perdida = loss(model, u_prima, f, t)
    grad = tp.gradient(perdida, model.trainable_variables)
    optimizador.apply_gradients(zip(grad, model.trainable_variables))
    print(perdida)

The error I get is:

WARNING:tensorflow:Gradients do not exist for variables ['salida/bias:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?
tf.Tensor(0.9643340489229794, shape=(), dtype=float64)
  • There is no issue here. The output layer bias simply has no influence on the gradient wrt the input computed in `u_prima`, and thus no influence on the loss. You can include `use_bias=False` in the last dense layer to avoid the warning. – xdurch0 Feb 15 '22 at 23:22
  • It is not an error but a warning, and it makes sense if you do the math. The loss you have defined is l = (1 + t*d(salida.weight*x_2 + salida.bias)/dt - f(t))^2, which simplifies to (1 + t*d(salida.weight*x_2)/dt - f(t))^2, since the bias is constant with respect to t and its derivative vanishes. The loss therefore does not depend on the bias term of the output layer. – Saswata Chakravarty Feb 15 '22 at 23:29
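
Following the first comment's suggestion, a minimal sketch of the fix: rebuild the output layer without a bias so the warning disappears. Everything else stays unchanged; the training itself is unaffected either way, since the bias received no gradient to begin with.

# The bias of 'salida' gets no gradient, because d(bias)/dt = 0,
# so it can simply be dropped:
outputs = ks.layers.Dense(1, use_bias=False, name='salida')(x2)
model = ks.Model(inputs=inputs, outputs=outputs)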
