How tf.gradients works in TensorFlow

Question

Given a linear model like the following, I would like to get the gradient vector with respect to W and b.

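import numpy as np
import tensorflow as tf  # TensorFlow 1.x graph-mode API

rng = np.random  # assumed: rng is numpy.random, as rng.randn() suggests

# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model (tf.mul was renamed tf.multiply in TensorFlow 1.0)
pred = tf.add(tf.multiply(X, W), b)

# Mean squared error; n_samples (number of training examples) is defined elsewhere
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)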

However, cost is a function of x, y, w and b, and I only want the gradients with respect to w and b. If I try something like this:

grads = tf.gradients(cost, tf.all_variables())

My placeholders (X and Y) will also be included. Even if I do get a gradient for [x, y, w, b], how do I know which element of the result belongs to which parameter? It is just a list, with no names indicating which parameter each derivative was taken with respect to.

In this question I'm using parts of this code and I build on this question.

score 36 · Accepted Answer · edited May 05 '17 at 14:01

Quoting the docs for tf.gradients:

Constructs symbolic partial derivatives of sum of ys w.r.t. x in xs.

So, this should work:

dc_dw, dc_db = tf.gradients(cost, [W, b])

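Here, tf.gradients() returns a list containing the gradient of cost with respect to each tensor in the second argument, in the same order. So the first element is the gradient for W and the second is the gradient for b.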
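
To sanity-check what dc_dw and dc_db should evaluate to, here is a rough sketch in plain Python (no TensorFlow) that works out the same two derivatives by hand for the cost in the question and compares them against finite differences. The sample data and the helper names (cost, analytic_grads, numeric_grads) are made up for illustration and are not part of any library.

```python
# Plain-Python sketch (no TensorFlow): hand-derived gradients of
#   cost = sum((W*x + b - y)^2) / (2*n)
# with respect to W and b. Data and function names are illustrative only.

def cost(W, b, xs, ys):
    n = len(xs)
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)

def analytic_grads(W, b, xs, ys):
    """Return [dcost/dW, dcost/db], in the same order as [W, b]."""
    n = len(xs)
    residuals = [W * x + b - y for x, y in zip(xs, ys)]
    dc_dw = sum(r * x for r, x in zip(residuals, xs)) / n
    dc_db = sum(residuals) / n
    return [dc_dw, dc_db]

def numeric_grads(W, b, xs, ys, eps=1e-6):
    """Central finite differences, used only as a cross-check."""
    dw = (cost(W + eps, b, xs, ys) - cost(W - eps, b, xs, ys)) / (2 * eps)
    db = (cost(W, b + eps, xs, ys) - cost(W, b - eps, xs, ys)) / (2 * eps)
    return [dw, db]

# Made-up sample data and parameter values
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
W, b = 0.5, 0.1

# Unpack in the same order the parameters were listed, as with tf.gradients
dc_dw, dc_db = analytic_grads(W, b, xs, ys)
num_dw, num_db = numeric_grads(W, b, xs, ys)

print("dc_dw:", dc_dw, "finite difference:", num_dw)
print("dc_db:", dc_db, "finite difference:", num_db)
```

If you feed the same xs, ys, W and b into the TensorFlow graph, the values from tf.gradients(cost, [W, b]) should agree with these up to floating-point error.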

See the tf.gradients documentation for more information.

edited May 05 '17 at 14:01

buydadip

answered Jan 24 '17 at 08:47

Priyatham

Thanks, a small example makes all the difference! – user3139545 Jan 24 '17 at 09:45
