I am trying to implement TD-Gammon, as described in this paper, which uses the TD-Lambda learning algorithm. This has already been done here, but that implementation is 4 years old and doesn't use TensorFlow 2. I want to do this in TensorFlow 2, and I think I need to create a custom optimizer to perform the weight change described in the paper linked above.

I know that to create a custom optimizer, you subclass the Optimizer class and implement the _create_slots, _resource_apply_dense, _resource_apply_sparse, and get_config methods. However, the TD-Lambda weight update needs the neural network outputs (Y_t-1 and Y_t in the paper), and _resource_apply_dense only receives the gradient and the variable, so it doesn't seem to have access to those outputs.
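To make the requirement concrete, here is a minimal sketch of the update as I understand it from the paper, written with tf.GradientTape and manually kept eligibility traces rather than an Optimizer subclass. The function name td_lambda_episode, the tiny network, the random input states, and the alpha and lam values are placeholders of my own and are not taken from the paper or from the existing implementation.

```python
import tensorflow as tf


def td_lambda_episode(model, states, final_reward, alpha=0.1, lam=0.7):
    """Apply one episode of TD(lambda) updates to `model` in place.

    states: list of tensors of shape (1, n_features), one per time step.
    final_reward: scalar outcome used as the target at the last step.
    The model must already be built so its variables exist.
    """
    variables = model.trainable_variables
    # One eligibility trace per weight tensor, reset for each episode.
    traces = [tf.zeros(v.shape) for v in variables]

    for t, state in enumerate(states):
        # Y_t and its gradient with respect to the weights.
        with tf.GradientTape() as tape:
            y_t = tf.reshape(model(state), [])
        grads = tape.gradient(y_t, variables)

        # Target: the next prediction, or the game outcome at the end.
        if t + 1 < len(states):
            y_next = tf.reshape(model(states[t + 1]), [])
        else:
            y_next = tf.constant(final_reward, dtype=y_t.dtype)
        td_error = y_next - y_t

        for i, (var, grad) in enumerate(zip(variables, grads)):
            # e_t = lam * e_{t-1} + grad(Y_t)
            traces[i] = lam * traces[i] + grad
            # w <- w + alpha * (Y_{t+1} - Y_t) * e_t
            var.assign_add(alpha * td_error * traces[i])


# Hypothetical toy setup: 4 input features, one sigmoid output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="sigmoid"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
states = [tf.random.uniform((1, 4)) for _ in range(5)]
td_lambda_episode(model, states, final_reward=1.0)
```

In this form the outputs Y_t and Y_{t+1} are available because the update runs in the training loop, which is exactly the information that the optimizer's apply methods do not receive.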

How do I access the neural network outputs? Or am I just going about this the wrong way?

APhillips
kman99

0 Answers