I have been going through numerous articles on reinforcement learning, more specifically Q-learning. The part where I'm stuck is: how does it learn from past experiences? I came across a concept called experience replay, where the agent actually learns from past experiences, but the articles always bring in neural networks. I'm a bit confused by this. Do we really need neural networks to implement experience replay?

1 Answer
Some reinforcement learning algorithms, such as Q-learning, learn from experiences (understanding an experience as a tuple <state, action, next_state, reward>). Whether the experiences were collected previously or not doesn't matter too much, in the sense that the learning principle is the same. So you can collect the experiences and use them more than once, i.e., experience replay.
Experience replay can have several benefits, such as speeding up the learning process. Another benefit, which plays an important role when combining RL with neural networks, is that it stabilizes the learning process. Basically, during the learning process, when you train the network to learn some Q-values, it can "forget" the Q-values learnt in the past. In this case, if you store past experiences and use a set of them at each update, you force the network to learn all (the past and the new) Q-values.
This Stack Overflow answer may be useful to better understand why the neural network can forget the previous Q-values.
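
To make the idea concrete, here is a minimal sketch of experience replay combined with plain tabular Q-learning, i.e. without any neural network. The environment interface (`env.reset()`, `env.step()`, `env.sample_action()`, `env.n_actions`) is assumed for illustration; only the replay buffer and the update rule are the point.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past <state, action, reward, next_state, done> tuples for reuse."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # oldest experiences are dropped when full

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Reuse past experiences by sampling them uniformly at random.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


def q_learning_with_replay(env, episodes=500, alpha=0.1, gamma=0.99,
                           epsilon=0.1, batch_size=32):
    """Tabular Q-learning that updates from replayed experiences.

    `env` is an assumed environment with reset()/step()/sample_action()/n_actions.
    """
    q_table = {}          # maps (state, action) -> Q-value
    buffer = ReplayBuffer()

    def q(s, a):
        return q_table.get((s, a), 0.0)

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection.
            if random.random() < epsilon:
                action = env.sample_action()
            else:
                action = max(range(env.n_actions), key=lambda a: q(state, a))

            next_state, reward, done = env.step(action)
            buffer.add(state, action, reward, next_state, done)
            state = next_state

            # Learn from a random batch of stored experiences (experience replay).
            for s, a, r, s2, d in buffer.sample(batch_size):
                target = r if d else r + gamma * max(
                    q(s2, a2) for a2 in range(env.n_actions))
                q_table[(s, a)] = q(s, a) + alpha * (target - q(s, a))

    return q_table
```

So the answer to your question is no: experience replay is just "store experiences, then learn from them again later", and it works with a simple Q-table as above. Neural networks only enter the picture when the state space is too large for a table, which is also where replay's stabilizing effect matters most.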
