I'm implementing PPO2 reinforcement learning on tasks I built myself, and I keep running into the same situation: the agent seems to be nearly converged, then suddenly and catastrophically loses its performance and fails to hold a stable level afterwards. I don't know what the right word for this is.
What could cause such a catastrophic drop in performance? Any hints or tips would be appreciated.
Many thanks