
I've got a chain of standard TensorFlow operations, and I need to specify a custom gradient for this chain as a whole.

Say that, in the example below, these operations are grouped in a single Python function, 'my_op'. What I'm trying to do is specify a custom gradient for 'my_op'. I had a look at RegisterGradient, gradient_override_map, and tf.Graph.create_op, but I couldn't find a simple example of how to use them to define a custom gradient for a group of ops without rewriting the whole operation chain in C++.

import numpy as np
import tensorflow as tf

n = 2
m = 3
x = np.random.normal(size=(1, n))
A = tf.Variable(tf.truncated_normal(shape=(n, m), dtype=tf.float32))
b = tf.Variable(tf.zeros(shape=(1, m), dtype=tf.float32))

def my_op(a):
    # chain of standard TF ops whose gradient I want to override as a whole
    return tf.add(tf.matmul(a, A), b)

x_placeholder = tf.placeholder(tf.float32, shape=[1, n])
t = my_op(tf.stop_gradient(x_placeholder))

grad = tf.gradients(t, [A])

sess = tf.Session()
sess.run(tf.initialize_all_variables())

result = sess.run(grad, feed_dict={x_placeholder: x})

print(result)

sess.close()
njk
  • Perhaps example in [testFunctionGradientsWithGradFunc](https://github.com/tensorflow/tensorflow/blob/73ced9d797056c7e67a06ed2098dd809d85ec44a/tensorflow/python/ops/gradients_test.py#L351) is helpful – Yaroslav Bulatov Jul 27 '16 at 23:37
  • Thanks @Yaroslav, but I'm not sure that I fully understand. Shall I decorate my_op using function.Defun somehow? Would you be so kind as to add an answer and edit the simple example that I constructed? – njk Jul 28 '16 at 10:10
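
For what it's worth, a rough sketch of what the Defun pointer in the comment above might look like applied to the question's example, assuming A and b are passed in explicitly (so the gradient function sees them as inputs) and that the python_grad_func argument is available in the TensorFlow version in use; the gradient expressions themselves are just an illustration:

import tensorflow as tf
from tensorflow.python.framework import function

def _my_op_grad(op, grad):
    # op.inputs are the tensors passed to my_op below: (a, A, b).
    a, A, b = op.inputs
    # Return one gradient per input; these happen to be the ordinary matmul/add
    # gradients, but any expressions with matching shapes could go here.
    grad_a = tf.matmul(grad, A, transpose_b=True)
    grad_A = tf.matmul(a, grad, transpose_a=True)
    grad_b = grad
    return grad_a, grad_A, grad_b

# Defun turns the Python function into a single graph op; python_grad_func
# supplies its gradient instead of differentiating through the ops inside.
@function.Defun(tf.float32, tf.float32, tf.float32,
                python_grad_func=_my_op_grad)
def my_op(a, A, b):
    return tf.add(tf.matmul(a, A), b)

t = my_op(tf.stop_gradient(x_placeholder), A, b)
grad = tf.gradients(t, [A])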

1 Answer


As far as I can see, the best way to define a custom gradient (i.e. to apply some modification to the plain gradients) is to add a new custom op to TensorFlow, following the tutorial on adding a new op. As that tutorial shows, for a custom op that outputs its input, you can define its gradient in Python by making use of @ops.RegisterGradient("MyOp").
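
A minimal sketch of that mechanism in Python; the op name "MyOp" is hypothetical here, standing for whatever custom op gets built and loaded following that tutorial:

import tensorflow as tf
from tensorflow.python.framework import ops

# "MyOp" is a placeholder name for a custom op built in C++ and loaded into
# TensorFlow; its gradient can then be defined in Python like this.
@ops.RegisterGradient("MyOp")
def _my_op_grad(op, grad):
    # Return one gradient per input of the op; here the incoming gradient is
    # simply passed through unchanged, as for an identity-like op.
    return grad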

童学智
  • Thank you 童学智. I've seen that tutorial, but as I said I want to avoid writing my code in C++ because a combination of standard TensorFlow operations is sufficient for what I need. I just need to define a custom gradient for this chain of operations. – njk Jul 28 '16 at 08:05
  • Do you know how to define a custom operation without having to write it in C++? – njk Jul 28 '16 at 08:18
  • As far as the documentation says, no. But it's not a big deal in C++ if you are just going to change the gradients. You can create a custom op based on [identity_op](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/identity_op.h). Also note that ops and kernels are not the same thing; you may use `REGISTER_OP` to register an op based on some existing kernels (but this is just an idea, I haven't given it a try). – 童学智 Jul 28 '16 at 08:35
  • You may also want to have a look at another successful [example](http://stackoverflow.com/questions/36204225/tensorflow-custom-op-gradient?rq=1) – 童学智 Jul 28 '16 at 08:58
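
For reference, a pure-Python counterpart of the identity-based idea in these comments, using the gradient_override_map the question mentions, might look roughly like this; the gradient name "CustomIdentityGrad" and the doubling are purely illustrative, and my_op and x_placeholder are those from the question:

import tensorflow as tf
from tensorflow.python.framework import ops

# Register a gradient function under an illustrative name.
@ops.RegisterGradient("CustomIdentityGrad")
def _custom_identity_grad(op, grad):
    # Whatever should flow backward past this point; here, just scale by 2.
    return 2.0 * grad

g = tf.get_default_graph()
# Identity ops created inside this context use the custom gradient above.
with g.gradient_override_map({"Identity": "CustomIdentityGrad"}):
    t = tf.identity(my_op(x_placeholder))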