I'm currently thinking of implementing TD(λ) for a DQN. I know how to do it with a table (you update Q(s,a) and the eligibility trace e(s,a) for every state-action pair), but what happens when the Q-value comes from a function approximator (a neural network) instead? How would I update across all states, and how would the eligibility trace increments and decay work?
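For reference, here is roughly what I mean by the tabular version, a minimal sketch of Watkins-style Q(λ) with accumulating traces; the hyperparameters and the `env.step(...)` API are just placeholders:

```python
import numpy as np

def run_episode(env, Q, E, s, alpha=0.1, gamma=0.99, lam=0.9, epsilon=0.1):
    """Sketch of tabular Q(lambda): Q and E are (n_states, n_actions) arrays."""
    n_actions = Q.shape[1]
    E.fill(0.0)                                # reset traces at episode start
    done = False
    while not done:
        # epsilon-greedy action selection
        if np.random.rand() < epsilon:
            a = np.random.randint(n_actions)
        else:
            a = int(Q[s].argmax())
        greedy = a == int(Q[s].argmax())       # did we act greedily?

        s_next, r, done = env.step(a)          # hypothetical env API

        # one-step TD error toward the greedy bootstrap value
        target = r if done else r + gamma * Q[s_next].max()
        delta = target - Q[s, a]

        E[s, a] += 1.0                         # increment trace for the visited pair
        Q += alpha * delta * E                 # update ALL (s, a) pairs via their traces
        E *= (gamma * lam) if greedy else 0.0  # decay traces; cut after exploratory action

        s = s_next

# Usage (placeholder sizes):
# Q = np.zeros((n_states, n_actions)); E = np.zeros_like(Q)
# run_episode(env, Q, E, s0)
```

My question is what the analogue of `E` and of the "update all pairs" step is once Q(s,a) is a network output rather than a table entry.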
I've found two papers that might be related, but they don't really explain how to implement it; they mostly just show results. PDF Link 1 PDF Link 2