I am trying to use DDPG (Stable-Baselines3) to solve a problem.
I would like to know how to change the values sampled by the environment in every episode, in a way that is reproducible, using Stable-Baselines3.
For example, assume we have an environment in which we harvest energy, and the harvested energy is normally distributed. In every episode, I want to sample different values of the harvested energy. To emphasize: the different values must be reproducible across runs, so that I can compare the RL method with other methods on the same data. PS: I am using Stable-Baselines3.
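
To make the setup concrete, here is a minimal sketch of the behaviour I am after. It is a plain Python class, not a real Stable-Baselines3 or Gymnasium environment; the class name `HarvestEnvSketch`, its parameters, and the rule "episode seed = base seed + episode index" are assumptions made purely for illustration.

```python
import random


class HarvestEnvSketch:
    """Toy stand-in for the environment (hypothetical, not a gymnasium.Env)."""

    def __init__(self, base_seed=0, episode_len=5, mean=1.0, std=0.2):
        self.base_seed = base_seed      # fixed once per experiment
        self.episode_len = episode_len  # number of steps per episode
        self.mean = mean                # mean of the harvested energy
        self.std = std                  # standard deviation of the harvested energy
        self.episode_idx = -1           # incremented on every reset()
        self.harvest = []

    def reset(self):
        # Each episode gets its own seed, derived deterministically
        # from the base seed and the episode counter.
        self.episode_idx += 1
        rng = random.Random(self.base_seed + self.episode_idx)
        # Draw the normally distributed harvested energy for the whole episode.
        self.harvest = [rng.gauss(self.mean, self.std)
                        for _ in range(self.episode_len)]
        return list(self.harvest)


# Two independent runs with the same base seed see the same episodes,
# while consecutive episodes within one run differ from each other.
run_a = HarvestEnvSketch(base_seed=42)
run_b = HarvestEnvSketch(base_seed=42)
episodes_a = [run_a.reset() for _ in range(3)]
episodes_b = [run_b.reset() for _ in range(3)]
```

In this sketch `episodes_a` and `episodes_b` are identical, so a baseline method could be evaluated on exactly the same harvested-energy sequences as the RL agent. What I do not know is how to get this behaviour cleanly when the environment is driven by Stable-Baselines3.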