I'm running experiments for a project that uses the TensorFlow version of Acme. We want to run some additional experiments using HER (Hindsight Experience Replay).
I have been working on adding it but am struggling to get it to work. I'm looking at two ways to implement it and am having issues with both, so I wanted to get feedback or suggestions.
In one setup, we are building our own infrastructure: a custom environment and a replay buffer that will implement HER within the run_episode section.
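To make this first approach concrete, here is a rough sketch of the relabeling step I have in mind, using the "future" strategy (substitute goals that were actually achieved later in the same episode). It is plain Python and does not use any Acme or Reverb API. The names `Transition`, `sparse_reward`, and `her_relabel` are made up for illustration, and the 0 / -1 sparse reward and the `achieved_goal` / `desired_goal` fields are assumptions about the environment.

```python
import random
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Tuple

Goal = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """Hypothetical goal-conditioned transition (not an Acme type)."""
    observation: Tuple[float, ...]
    action: int
    reward: float
    next_observation: Tuple[float, ...]
    achieved_goal: Goal  # goal actually reached after taking the action
    desired_goal: Goal   # goal the agent was conditioned on


def sparse_reward(achieved: Goal, desired: Goal, tol: float = 1e-6) -> float:
    """Assumed reward: 0 when the goal is reached, -1 otherwise."""
    dist = sum((a - d) ** 2 for a, d in zip(achieved, desired)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel(
    episode: List[Transition],
    reward_fn: Callable[[Goal, Goal], float] = sparse_reward,
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Return the original transitions plus k relabeled copies of each.

    For each transition at step t, a substitute goal is drawn from the
    achieved goals at steps t..T-1 of the same episode, and the reward is
    recomputed against that substitute goal.
    """
    rng = rng or random.Random()
    out: List[Transition] = []
    for t, tr in enumerate(episode):
        out.append(tr)  # always keep the original transition
        for _ in range(k):
            future = episode[rng.randint(t, len(episode) - 1)]
            new_goal = future.achieved_goal
            out.append(
                replace(
                    tr,
                    desired_goal=new_goal,
                    reward=reward_fn(tr.achieved_goal, new_goal),
                )
            )
    return out
```

The idea would be to call this once per finished episode, before the transitions are added to the replay buffer. One caveat: if the goal is also part of the observation the network sees, the observation would have to be rebuilt with the substitute goal as well, which this sketch does not do.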
In the other, I have been experimenting with the built-in Reverb buffer, but I am not very familiar with it.
If anyone has suggestions on either of these approaches, or on other approaches, I would appreciate it. I have quite a few different versions, so if you have a specific question about any of this, let me know and I can post the relevant code.