
As said in https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Optimizer?hl=en#minimize, the first parameter of minimize() should satisfy the following requirement:

Tensor or callable. If a callable, loss should take no arguments and return the value to minimize. If a Tensor, the tape argument must be passed.

The first piece of code passes a tensor to minimize(), so it requires the gradient tape, but I don't know how to supply it.

The second piece of code passes a callable to minimize(), which is easy.

import numpy as np
import tensorflow as tf
from tensorflow import keras

x_train = [1, 2, 3]
y_train = [1, 2, 3]

W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.random.normal([1]), name='bias')
hypothesis = W * x_train + b


@tf.function
def cost():
    y_model = W * x_train + b
    error = tf.reduce_mean(tf.square(y_train - y_model))
    return error


optimizer = tf.optimizers.SGD(learning_rate=0.01)

cost_value = cost()
train = tf.keras.optimizers.Adam().minimize(cost_value, var_list=[W, b])

tf.print(W)
tf.print(b)

How do I add the gradient tape? I know the following code certainly works:

import numpy as np
import tensorflow as tf
from tensorflow import keras

x_train = [1, 2, 3]
y_train = [1, 2, 3]

W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.random.normal([1]), name='bias')
hypothesis = W * x_train + b


@tf.function
def cost():
    y_model = W * x_train + b
    error = tf.reduce_mean(tf.square(y_train - y_model))
    return error


optimizer = tf.optimizers.SGD(learning_rate=0.01)

cost_value = cost()
train = tf.keras.optimizers.Adam().minimize(cost, var_list=[W, b])

tf.print(W)
tf.print(b)

Please help me revise the first piece of code and let it run, thanks!

Qiqin Zhan

3 Answers


This is a late answer (Hakan basically got it for you), but I write this in hopes that it will help people in the future who are stuck and googling this exact question (like I was). This is also an alternate implementation using tf.GradientTape() directly.

import numpy as np
import tensorflow as tf
from tensorflow import keras

x_train = [1, 2, 3]
y_train = [1, 2, 3]

W = tf.Variable(tf.random.normal([1]), trainable=True, name='weight')
b = tf.Variable(tf.random.normal([1]), trainable=True, name='bias')

@tf.function
def cost(W, b):
    y_model = W * x_train + b
    error = tf.reduce_mean(tf.square(y_train - y_model))
    return error


optimizer = tf.optimizers.SGD(learning_rate=0.01)

trainable_vars = [W,b]

epochs = 100 #(or however many iterations you want it to run)
for _ in range(epochs):
    with tf.GradientTape() as tp:
        #your loss/cost function must always be contained within the gradient tape instantiation
        cost_fn = cost(W, b)
    gradients = tp.gradient(cost_fn, trainable_vars)
    optimizer.apply_gradients(zip(gradients, trainable_vars))

tf.print(W)
tf.print(b)

This should give you the value of your weights and biases after the number of epochs you ran.

You must compute the loss function every time a new gradient tape is invoked. Then you take the gradient of your loss with respect to the trainable variables, and call optimizer.apply_gradients to perform the minimization, as described in the TensorFlow documentation here: https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Optimizer#apply_gradients.
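For completeness, the tape argument the original question asks about lets you pass a Tensor loss directly to minimize(). A minimal sketch (assuming TF ≥ 2.4, where Optimizer.minimize accepts a tape= parameter; the learning rate and step count here are my own choices, not from the question):

```python
import tensorflow as tf

x_train = tf.constant([1., 2., 3.])
y_train = tf.constant([1., 2., 3.])

W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.random.normal([1]), name='bias')
optimizer = tf.optimizers.SGD(learning_rate=0.05)

for _ in range(1000):
    # The loss tensor must be computed while the tape is recording,
    # and that same tape is then handed to minimize().
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(y_train - (W * x_train + b)))
    optimizer.minimize(loss, var_list=[W, b], tape=tape)

tf.print(W)
tf.print(b)
```

With a Tensor loss, minimize() just computes tape.gradient(loss, var_list) and applies the result, so it should behave the same as an explicit apply_gradients loop.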

AndrewJaeyoung

This occurs because .minimize() expects a function. While cost_value = cost() is a tf.Tensor object, cost is a tf.function. You should directly pass your loss function into minimize, as tf.keras.optimizers.Adam().minimize(cost, var_list=[W, b]).
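One thing worth noting (a sketch of my own, not from the original post): each call to minimize() with a callable performs only a single gradient step, so in practice the call is wrapped in a loop. Something like:

```python
import tensorflow as tf

x_train = tf.constant([1., 2., 3.])
y_train = tf.constant([1., 2., 3.])

W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.random.normal([1]), name='bias')

def cost():
    # Takes no arguments; minimize() calls it under its own internal tape.
    return tf.reduce_mean(tf.square(y_train - (W * x_train + b)))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
for _ in range(1000):
    optimizer.minimize(cost, var_list=[W, b])  # one update per call

tf.print(W)
tf.print(b)
```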

Changed part for Gradient:

train = tf.keras.optimizers.Adam().minimize(cost(), var_list=[W, b],tape=tf.GradientTape())
Hakan Akgün
  • As said in https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Optimizer?hl=en#minimize, in addition to a function, minimize() can also take a tensor as input. – Qiqin Zhan Aug 22 '21 at 09:45
  • I think it's not usable anymore. There is an explanation for that in https://github.com/tensorflow/tensorflow/issues/42447#issuecomment-675778979 Also if you want to explicitly manipulate your gradients you can use tf.GradientTape() methods. – Hakan Akgün Aug 22 '21 at 09:58
  • According to the interface of tf.keras.optimizers.Adam().minimize(loss, var_list, grad_loss=None, name=None, tape=None), if the tape parameter is given, there should not be errors. But giving the tape parameter is a challenge. – Qiqin Zhan Aug 22 '21 at 10:04
  • Providing tf.GradientTape() as tf.keras.optimizers.Adam().minimize(cost(), var_list=[W, b], tape=tf.GradientTape()) eliminates that error but creates a new one: "No gradients provided for any variable: ['weight:0', 'bias:0']" – Hakan Akgün Aug 22 '21 at 10:13
  • I've added the part that I've changed. – Hakan Akgün Aug 22 '21 at 10:26

The optimization process needs to iterate repeatedly for better results. Wrap the computation inside a GradientTape for automatic differentiation.

with tf.GradientTape() as g:
    pred = W * X + b # Linear regression.
    loss = tf.reduce_mean(tf.square(pred - Y))

Compute gradients.

gradients = g.gradient(loss, [W, b])

Update W and b following gradients.

optimizer.apply_gradients(zip(gradients, [W, b]))
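Assembled into a runnable whole (assuming the same toy data as the question and an SGD optimizer, neither of which this answer defines; the learning rate and step count are illustrative), the three fragments might look like:

```python
import tensorflow as tf

X = tf.constant([1., 2., 3.])
Y = tf.constant([1., 2., 3.])

W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.random.normal([1]), name='bias')
optimizer = tf.optimizers.SGD(learning_rate=0.05)

for step in range(1000):
    # Wrap computation inside a GradientTape for automatic differentiation.
    with tf.GradientTape() as g:
        pred = W * X + b  # Linear regression.
        loss = tf.reduce_mean(tf.square(pred - Y))
    # Compute gradients of the loss with respect to W and b.
    gradients = g.gradient(loss, [W, b])
    # Update W and b following the gradients.
    optimizer.apply_gradients(zip(gradients, [W, b]))

tf.print(W)
tf.print(b)
```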