Tensorflow - No gradients provided for any variable

Question

I am experimenting with some code in Jupyter and keep getting stuck here. Everything works fine if I remove the line starting with "optimizer = ..." and all references to it. But if I keep this line in the code, it raises an error.

I am not pasting all the other functions here, to keep the code at a readable size. I hope someone more experienced can see at once what the problem is.

Note that there are 5, 4, 3, and 2 units in the input layer, the two hidden layers, and the output layer, respectively.

CODE:

tf.reset_default_graph()

num_units_in_layers = [5,4,3,2]

X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
Y = tf.placeholder(shape=[2, 3], dtype=tf.float32)
parameters = initialize_layer_parameters(num_units_in_layers)
init = tf.global_variables_initializer() 

my_sess = tf.Session()
my_sess.run(init)
ZL = forward_propagation_with_relu(X, num_units_in_layers, parameters, my_sess)
#my_sess.run(parameters)  # Do I need to run this? Or is it obsolete?

cost = compute_cost(ZL, Y, my_sess, parameters, batch_size=3, lambd=0.05)
optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
_ , minibatch_cost = my_sess.run([optimizer, cost], 
                                 feed_dict={X: minibatch_X, 
                                            Y: minibatch_Y})

print(minibatch_cost)
my_sess.close()

ERROR:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-321-135b9fc18268> in <module>()
     16 cost = compute_cost(ZL, Y, my_sess, parameters, 3, 0.05)
     17 
---> 18 optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
     19 _ , minibatch_cost = my_sess.run([optimizer, cost], 
     20                                  feed_dict={X: minibatch_X, 

~/.local/lib/python3.5/site-packages/tensorflow/python/training/optimizer.py in minimize(self, loss, global_step, var_list, gate_gradients, aggregation_method, colocate_gradients_with_ops, name, grad_loss)
    362           "No gradients provided for any variable, check your graph for ops"
    363           " that do not support gradients, between variables %s and loss %s." %
--> 364           ([str(v) for _, v in grads_and_vars], loss))
    365 
    366     return self.apply_gradients(grads_and_vars, global_step=global_step,

ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'weights/W1:0' shape=(4, 5) dtype=float32_ref>", "<tf.Variable 'biases/b1:0' shape=(4, 1) dtype=float32_ref>", "<tf.Variable 'weights/W2:0' shape=(3, 4) dtype=float32_ref>", "<tf.Variable 'biases/b2:0' shape=(3, 1) dtype=float32_ref>", "<tf.Variable 'weights/W3:0' shape=(2, 3) dtype=float32_ref>", "<tf.Variable 'biases/b3:0' shape=(2, 1) dtype=float32_ref>"] and loss Tensor("Add_3:0", shape=(), dtype=float32).

Note that if I run

print(tf.trainable_variables())

just before the "optimizer = ..." line, I actually see my trainable variables there.

[<tf.Variable 'weights/W1:0' shape=(4, 5) dtype=float32_ref>, <tf.Variable 'biases/b1:0' shape=(4, 1) dtype=float32_ref>, <tf.Variable 'weights/W2:0' shape=(3, 4) dtype=float32_ref>, <tf.Variable 'biases/b2:0' shape=(3, 1) dtype=float32_ref>, <tf.Variable 'weights/W3:0' shape=(2, 3) dtype=float32_ref>, <tf.Variable 'biases/b3:0' shape=(2, 1) dtype=float32_ref>]

Would anyone have an idea about what can be the problem?

EDIT, ADDING SOME MORE INFO: In case you would like to see how I create and initialize my parameters, here is the code. Maybe there is something wrong with this part, but I don't see what.

def get_nn_parameter(variable_scope, variable_name, dim1, dim2):
  with tf.variable_scope(variable_scope, reuse=tf.AUTO_REUSE):
    v = tf.get_variable(variable_name, 
                        [dim1, dim2], 
                        trainable=True, 
                        initializer = tf.contrib.layers.xavier_initializer())
  return v


def initialize_layer_parameters(num_units_in_layers):
    parameters = {}
    L = len(num_units_in_layers)

    for i in range (1, L):
        temp_weight = get_nn_parameter("weights",
                                       "W"+str(i), 
                                       num_units_in_layers[i], 
                                       num_units_in_layers[i-1])
        parameters.update({"W" + str(i) : temp_weight})  
        temp_bias = get_nn_parameter("biases",
                                     "b"+str(i), 
                                     num_units_in_layers[i], 
                                     1)
        parameters.update({"b" + str(i) : temp_bias})  

    return parameters
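
As a quick sanity check on the shapes these two functions should produce, here is a minimal sketch in plain Python (no TensorFlow needed). The helper name layer_parameter_shapes is hypothetical and exists only for this illustration; it mirrors the loop above but records shapes instead of creating variables.

```python
def layer_parameter_shapes(num_units_in_layers):
    """Return {name: (rows, cols)} for each W and b, mirroring the loop above."""
    shapes = {}
    for i in range(1, len(num_units_in_layers)):
        # W_i maps layer i-1 activations to layer i, so it is (units_i, units_{i-1}).
        shapes["W" + str(i)] = (num_units_in_layers[i], num_units_in_layers[i - 1])
        # b_i is a column vector with one entry per unit in layer i.
        shapes["b" + str(i)] = (num_units_in_layers[i], 1)
    return shapes


shapes = layer_parameter_shapes([5, 4, 3, 2])
for name, shape in shapes.items():
    print(name, shape)
```

For [5, 4, 3, 2] this gives the same shapes that appear in the error message: (4, 5), (4, 1), (3, 4), (3, 1), (2, 3) and (2, 1). So the variables themselves look fine, which suggests the problem lies in how the cost is connected to them.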

ADDENDUM

I got it working. Instead of writing a separate answer, I am adding the correct version of my code here.

(David's answer below helped a lot.)

I simply removed my_sess as a parameter of my compute_cost function. (I could not make it work without it previously, but it turns out it is not needed at all.) I also reordered the statements in my main function so that the whole graph, including the optimizer, is built before the variables are initialized and the session runs anything.

Here is the working version of my cost function and how I call it:

def compute_cost(ZL, Y, parameters, mb_size, lambd):

    logits = tf.transpose(ZL)
    labels = tf.transpose(Y)

    cost_unregularized = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits = logits, labels = labels))

    # The parameters dict holds both W and b for every layer, so halve its length to get the number of layers L
    L = len(parameters) // 2

    list_sum_weights = []

    for i in range (0, L):
        list_sum_weights.append(tf.nn.l2_loss(parameters.get("W"+str(i+1))))

    regularization_effect = tf.multiply((lambd / mb_size), tf.add_n(list_sum_weights))
    cost = tf.add(cost_unregularized, regularization_effect)

    return cost
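
To make the arithmetic of this cost concrete, here is a small sketch in plain Python that reproduces the same formula by hand: the mean softmax cross-entropy plus (lambd / mb_size) times the sum of the per-layer L2 losses. The numbers are made up for illustration, and the helpers l2_loss and softmax_cross_entropy are hypothetical stand-ins written for this sketch, not the TensorFlow functions themselves. The one fact borrowed from TensorFlow is that tf.nn.l2_loss returns half the sum of squares.

```python
import math


def l2_loss(matrix):
    """Half the sum of squares, the same quantity tf.nn.l2_loss computes."""
    return sum(w * w for row in matrix for w in row) / 2.0


def softmax_cross_entropy(logits, labels):
    """Cross-entropy of one example: -sum(labels * log_softmax(logits))."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(y * (z - log_sum) for y, z in zip(labels, logits))


# Made-up numbers: two examples, two classes, one 2x2 weight matrix.
logits = [[2.0, 0.0], [0.0, 0.0]]
labels = [[1.0, 0.0], [0.5, 0.5]]
weights = {"W1": [[1.0, -2.0], [0.0, 3.0]]}
lambd, mb_size = 0.05, 2

cost_unregularized = sum(
    softmax_cross_entropy(z, y) for z, y in zip(logits, labels)
) / len(logits)
regularization_effect = (lambd / mb_size) * sum(l2_loss(W) for W in weights.values())
cost = cost_unregularized + regularization_effect

print(cost_unregularized, regularization_effect, cost)
```

Here the L2 loss of W1 is (1 + 4 + 0 + 9) / 2 = 7, so the regularization term is 0.025 * 7 = 0.175. Every step is ordinary arithmetic on the weights, which is why the TensorFlow version can be built purely from tensors without a session.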

And here is the main function where I call the compute_cost(..) function:

tf.reset_default_graph()

num_units_in_layers = [5,4,3,2]

X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
Y = tf.placeholder(shape=[2, 3], dtype=tf.float32)
parameters = initialize_layer_parameters(num_units_in_layers)

my_sess = tf.Session()
ZL = forward_propagation_with_relu(X, num_units_in_layers, parameters)

cost = compute_cost(ZL, Y, parameters, 3, 0.05)
optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
init = tf.global_variables_initializer() 

my_sess.run(init)
_ , minibatch_cost = my_sess.run([optimizer, cost], 
                                 feed_dict={X: [[-1.,4.,-7.],[2.,6.,2.],[3.,3.,9.],[8.,4.,4.],[5.,3.,5.]], 
                                            Y: [[0.6, 0., 0.3], [0.4, 0., 0.7]]})


print(minibatch_cost)

my_sess.close()

score 3 · Accepted Answer · answered Mar 15 '18 at 01:56

I'm 99.9% sure you're creating your cost function incorrectly.

cost = compute_cost(ZL, Y, my_sess, parameters, batch_size=3, lambd=0.05)

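Your cost should be a tensor that is built only from other tensors and variables. You are passing your session into the cost function, which suggests the function is actually running the session internally, and that is a serious error. Anything you evaluate with sess.run comes back as a plain NumPy value, so a cost computed from such values is no longer connected to your variables in the graph, and the optimizer cannot find any gradients between them and the loss.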

Then later you're passing the result of compute_cost to your minimizer.

This is a common misunderstanding about TensorFlow.

TensorFlow follows a declarative programming paradigm: you first declare all the operations you want to run, and only later do you pass data in and run them.
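
To illustrate the declare-then-run idea without depending on TensorFlow, here is a toy deferred-evaluation graph in plain Python. It is only a sketch of the concept, not how TensorFlow is implemented; Node, as_node and depends_on are hypothetical names invented for this example. It shows why evaluating a value too early (the analogue of calling sess.run inside the cost function) cuts the link between the cost and the variable.

```python
class Node:
    """A value that is described now and computed later."""

    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op              # "var", "const", "add" or "mul"
        self.inputs = tuple(inputs)
        self.value = value        # only used by "var" and "const" nodes
        self.name = name

    def __add__(self, other):
        return Node("add", (self, as_node(other)))

    def __mul__(self, other):
        return Node("mul", (self, as_node(other)))

    def run(self):
        """Evaluate the node, loosely analogous to sess.run(node)."""
        if self.op in ("var", "const"):
            return self.value
        left, right = (n.run() for n in self.inputs)
        return left + right if self.op == "add" else left * right


def as_node(x):
    """Wrap plain numbers as constant nodes."""
    return x if isinstance(x, Node) else Node("const", value=x)


def depends_on(node, target):
    """True if `target` is reachable from `node` through its inputs."""
    return node is target or any(depends_on(n, target) for n in node.inputs)


w = Node("var", value=3.0, name="w")

# Declared symbolically: the cost keeps a link back to w.
cost_ok = w * w + 1.0

# Evaluated too early: w.run() is a plain float, so the link to w is lost.
cost_broken = as_node(w.run()) * w.run() + 1.0

print(cost_ok.run(), depends_on(cost_ok, w))          # 10.0 True
print(cost_broken.run(), depends_on(cost_broken, w))  # 10.0 False
```

Both costs evaluate to the same number, but only the first one still knows it came from w. An optimizer can only differentiate along links like that, which is the situation the "No gradients provided for any variable" message describes.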

Refactor your code to strictly follow this best practice:

(1) Create a build_graph() function and place all of your math operations in it. Define your cost function and all layers of the network there. Return the optimizer.minimize() training op (and any other ops you might want to get back, such as accuracy).

(2) Now create a session.

(3) After this point, do not create any more TensorFlow operations or variables. If you feel like you need to, you're doing something wrong.

(4) Call sess.run on your train_op, and pass in the placeholder data via feed_dict.

Here's a simple example of how to structure your code:

https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/neural_network_raw.ipynb

In general, aymericdamien has put up tremendously good examples. I strongly recommend reviewing them to learn the basics of TensorFlow.

David Parks

David, thank you for your answer. I had a couple of questions based on your answer but could not manage to fit it in the comment window so I put it above as a separate answer. In case you have the possibility to have a look.. Thank you in advance! – edn Mar 15 '18 at 02:33
Do you think you can help me with the following 3 questions? 1) Do you mean that I should never pass my TensorFlow session as a parameter to any function? 2) My dilemma is that I could not read the weights in the compute_cost function. I could only use them with my_sess.run(..weight..); otherwise I was getting errors all the time. (See below regarding how I used it.) 3) Also, isn't my function below returning a tensor? – edn Mar 15 '18 at 04:44
Hi @edn, can you please move your answer to the question. You can always edit a question after it was asked (nice to clearly indicate that you're adding an addendum), but it's quite incorrect form on stack overflow to post a question as an answer (also it reduces the likelihood of having other people jump in with answers as it will look like a well addressed question). You should do that and delete the answer you posted to correct it. – David Parks Mar 15 '18 at 15:21
As for the code you posted, it's looking correct to me. Are you encountering the same error message? Your cost function looks like it's built up correctly as far as I can tell. Try debugging by calling `tf.gradients(cost, tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))` to validate that you can produce gradients for the variables. If not, try getting that collection of variables and calling `tf.gradients` on each variable individually to see which one is causing the issue. Once you know which one it is, I'd post another question directly specifying the problematic variable and we might be able to understand. – David Parks Mar 15 '18 at 15:24
Also, the best practice is to create the session just before this line `my_sess.run(init)`. While it won't change anything in your current code it'll likely keep you from making subtle but common mistakes in the future. The rule of thumb: no creating OPs after your session has been opened. – David Parks Mar 15 '18 at 15:28
Thank you for the heads up. I moved the correct version of my code as addendum to my question above. I hope the format looks correct now. I also tried to run your (David's) tips and it seems to be working. When I run: print(tf.gradients(cost, tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))) It gives: [, ,..... But interesting to see that it does not give their names as W1, b1, etc. I can understand from given matrix sizes which one is which weight – edn Mar 15 '18 at 23:28
Another best practice: use the `name=...` property of your tensors so you don't have to do so much guesswork during debugging. Debugging TensorFlow is no walk in the park. :) I'm confused as to how you could call `tf.gradients` successfully but still get that error in the optimizer. Try using the optimizer's `apply_gradients` function. When you call `minimize`, that's effectively just calling `tf.gradients` and `optimizer.apply_gradients` for you. I'm hoping something pops out at you in the process. – David Parks Mar 16 '18 at 01:46
I may have caused a misunderstanding. I actually solved the problem and it is working now. I edited my first entry above and put the working version of my code at the bottom of it. The tf.gradients(...) call is working as well. On the other hand, I actually do give names to my tensors (weights and biases). Please check the function "initialize_layer_parameters(...)" in the code above. Or are you referring to something else when you say I should give names to my tensors? Regarding your comment on debugging TensorFlow code: well, I could not agree more. :) – edn Mar 16 '18 at 02:10
Any idea what could be the problem [here](https://stackoverflow.com/questions/61249708/valueerror-no-gradients-provided-for-any-variable-tensorflow-2-0-keras)? – Stefan Falk Apr 20 '20 at 07:47
