I tried the DoubleDQN and DQN algorithms on the gym NChain environment and found that DoubleDQN was neither more stable nor better performing than DQN.
I train on a batch of size 1 after each action taken. Could this be the reason DoubleDQN does not outperform DQN?
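
For context, here is a minimal sketch of how the two update targets differ, since that is the only place the algorithms diverge. It is an illustration and not my exact training code: the function names and the NumPy array layout (one row per transition, one column per action) are assumptions made for the example.

```python
import numpy as np

def dqn_target(reward, done, q_target_next, gamma=0.9):
    # DQN: the target network both selects and evaluates the next action,
    # so the max operator tends to overestimate noisy Q-values.
    return reward + gamma * (1.0 - done) * np.max(q_target_next, axis=1)

def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.9):
    # Double DQN: the online network selects the next action,
    # the target network evaluates that selected action.
    best_action = np.argmax(q_online_next, axis=1)
    evaluated = q_target_next[np.arange(len(best_action)), best_action]
    return reward + gamma * (1.0 - done) * evaluated

# A single transition (batch size 1) with two actions, hypothetical values.
reward = np.array([1.0])
done = np.array([0.0])
q_online_next = np.array([[0.5, 2.0]])   # online network prefers action 1
q_target_next = np.array([[3.0, 1.0]])   # target network prefers action 0

print(dqn_target(reward, done, q_target_next))
print(double_dqn_target(reward, done, q_online_next, q_target_next))
```

With these made-up values the DQN target uses the larger target-network estimate, while the DoubleDQN target uses the target network's value for the action the online network picked. With a batch size of 1, each of these targets comes from a single sampled transition.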