
According to Sutton's book, Reinforcement Learning: An Introduction, the update equation for the network weights is:

theta = theta + alpha * delta * e

where `e` is the eligibility trace. This is similar to a gradient descent update with an extra elementwise factor of `e`.
Can this eligibility trace be included in `tf.train.GradientDescentOptimizer` in TensorFlow?
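For concreteness, here is a framework-free sketch (NumPy) of one step of this update. The shapes, step size, trace-decay parameters, gradient, and TD error are all made-up illustrative values, not part of the question:

```python
import numpy as np

alpha = 0.1             # step size (illustrative value)
gamma, lam = 0.9, 0.8   # discount and trace-decay parameters (illustrative)

theta = np.zeros((4, 1))    # weights, shape [number_of_classes, 1]
e = np.zeros_like(theta)    # eligibility trace, same shape as theta

# One TD(lambda)-style step with placeholder values:
grad = np.ones_like(theta)  # stand-in for the gradient of the value estimate
delta = 0.5                 # stand-in for the TD error

e = gamma * lam * e + grad           # accumulate the trace
theta = theta + alpha * delta * e    # the update from the question
```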

nikpod
    I'm assuming `theta`, `e`, and `delta` all have the same shape, in which case you just want elementwise multiplication of the gradient? Which could be accomplished via an identity op with a custom gradient placed before the value of a variable is used. If that sounds right I can put together an example. – Allen Lavoie Jun 06 '17 at 17:28
  • That's right. `theta`, `e`, and `delta` have the shape `[Number_of_classes, 1]` – nikpod Jun 07 '17 at 04:39
  • Ah, apparently there's already [tf.contrib.layers.scale_gradient](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/layers/python/layers/layers.py#L1837). Does that work for you? – Allen Lavoie Jun 07 '17 at 16:37
  • Perfect. Could you post an example as the answer? – nikpod Jun 07 '17 at 17:32

1 Answer


Here's a simple example of using `tf.contrib.layers.scale_gradient` to do elementwise multiplication of gradients. In the forward pass it's just an identity op, and in the backward pass it multiplies gradients by its second argument.

import tensorflow as tf

with tf.Graph().as_default():
  some_value = tf.constant([0., 0., 0.])
  # Identity in the forward pass; multiplies incoming gradients
  # elementwise by [0.1, 0.2, 0.3] in the backward pass.
  scaled = tf.contrib.layers.scale_gradient(some_value, [0.1, 0.2, 0.3])
  (some_value_gradient,) = tf.gradients(tf.reduce_sum(scaled), some_value)
  with tf.Session():
    print(scaled.eval())               # forward pass: values unchanged
    print(some_value_gradient.eval())  # backward pass: the scaling factors

Prints:

[ 0.  0.  0.]
[ 0.1         0.2         0.30000001]
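Tying this back to the question: scaling the gradient elementwise by the eligibility trace and then taking a plain gradient-descent step is what this op enables. A framework-free sketch of that combined step (NumPy, with made-up illustrative values; `grad` here is a stand-in for the loss gradient at `theta`):

```python
import numpy as np

alpha = 0.1                             # step size (illustrative)
theta = np.array([[1.0], [2.0], [3.0]])
e = np.array([[0.5], [1.0], [0.0]])     # eligibility trace
grad = np.array([[2.0], [2.0], [2.0]])  # placeholder loss gradient w.r.t. theta

# Elementwise scaling (what scale_gradient does in the backward pass),
# followed by an ordinary gradient-descent step:
theta_new = theta - alpha * (e * grad)
```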
Allen Lavoie