I'm trying to use TensorFlow's @tf.custom_gradient functionality to assign a custom gradient to a function with multiple inputs. I can put together a working setup for only one input, but not for two or more.
I've based my code on TensorFlow's custom_gradient documentation, which works just fine for one input, as in this example:
import os
# Suppress TensorFlow startup info (set before importing
# TensorFlow so the environment variable actually takes effect)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow as tf

# Custom gradient decorator on a function,
# as described in the documentation
@tf.custom_gradient
def my_identity(x):
    # The custom gradient
    def grad(dy):
        return dy
    # Return the result AND the gradient
    return tf.identity(x), grad

# Make a variable, run it through the custom op
x = tf.get_variable('x', initializer=1.)
y = my_identity(x)

# Calculate loss, make an optimizer, train the variable
loss = tf.abs(y)
opt = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train = opt.minimize(loss)

# Start a TensorFlow session, initialize variables, train
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train)
This example runs silently, then closes. No issues, no errors. The variable optimizes as expected. However, in my application, I need to do such a calculation with multiple inputs, so something of this form:
@tf.custom_gradient
def my_identity(x, z):
    def grad(dy):
        return dy
    return tf.identity(x*z), grad
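For reference, the rest of the script is essentially unchanged from the single-input version; the call site just gets a second variable (the initializer value for z below is a placeholder I picked, nothing meaningful):
# Make two variables, run them through the two-input custom op
x = tf.get_variable('x', initializer=1.)
z = tf.get_variable('z', initializer=2.)
y = my_identity(x, z)

# Same loss, optimizer, and training step as before
loss = tf.abs(y)
opt = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train = opt.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train)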
Running this in place of the example (with the second variable input added to the call of my_identity, as above) results in the following error output. As best I can tell, the last part of the error comes from the dynamic generation of the op -- the format matches the protobuf-style node definition used when an op is registered (though that's about all I know about it).
Traceback (most recent call last):
  File "testing.py", line 27, in <module>
    train = opt.minimize(loss)
  File "/usr/lib/python3/dist-packages/tensorflow/python/training/optimizer.py", line 400, in minimize
    grad_loss=grad_loss)
  File "/usr/lib/python3/dist-packages/tensorflow/python/training/optimizer.py", line 519, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "/usr/lib/python3/dist-packages/tensorflow/python/ops/gradients_impl.py", line 630, in gradients
    gate_gradients, aggregation_method, stop_gradients)
  File "/usr/lib/python3/dist-packages/tensorflow/python/ops/gradients_impl.py", line 821, in _GradientsHelper
    _VerifyGeneratedGradients(in_grads, op)
  File "/usr/lib/python3/dist-packages/tensorflow/python/ops/gradients_impl.py", line 323, in _VerifyGeneratedGradients
    "inputs %d" % (len(grads), op.node_def, len(op.inputs)))
ValueError: Num gradients 2 generated for op name: "IdentityN"
op: "IdentityN"
input: "Identity"
input: "x/read"
input: "y/read"
attr {
  key: "T"
  value {
    list {
      type: DT_FLOAT
      type: DT_FLOAT
      type: DT_FLOAT
    }
  }
}
attr {
  key: "_gradient_op_type"
  value {
    s: "CustomGradient-9"
  }
}
do not match num inputs 3
Based on other custom gradient examples I've seen, I surmised that the issue was the lack of a supplied gradient for the second input argument. So, I changed my function to this:
@tf.custom_gradient
def my_identity(x, z):
    def grad(dy):
        return dy
    return tf.identity(x*z), grad, grad
This results in the following more familiar error:
Traceback (most recent call last):
  File "testing.py", line 22, in <module>
    y = my_identity(x, z)
  File "/usr/lib/python3/dist-packages/tensorflow/python/ops/custom_gradient.py", line 111, in decorated
    return _graph_mode_decorator(f, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/tensorflow/python/ops/custom_gradient.py", line 132, in _graph_mode_decorator
    result, grad_fn = f(*args)
ValueError: too many values to unpack (expected 2)
The @tf.custom_gradient decorator only recognizes the last returned element as the gradient. So, I tried putting the two gradients into a tuple as (grad, grad), such that the function would still only have "two" outputs. TensorFlow rejected this too, this time because it can't call a tuple the way it would call the gradient function -- entirely reasonable, in hindsight.
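Concretely (with everything else unchanged), that attempt looked roughly like this:
@tf.custom_gradient
def my_identity(x, z):
    def grad(dy):
        return dy
    # Pack both gradients into a single tuple so the function still
    # returns exactly two things; TensorFlow accepts the unpacking but
    # later fails when it tries to call the tuple as the gradient function
    return tf.identity(x*z), (grad, grad)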
I've fussed around with the example some more, but to no avail. No matter what I try, I can't get the custom-defined gradient to handle multiple inputs. I'm hoping that somebody with more knowledge of custom ops and gradients than I have will have a better idea on this -- thanks in advance for the help!