In deep Q-learning, I think of the neural network as playing the role of the Q-table in ordinary (tabular) Q-learning. In tabular Q-learning, the same table supplies both the target and the value being updated, so why can't we use the same network as both the target Q-network and the prediction Q-network? I searched on Google, and someone said it is because the network ends up chasing its own tail, which makes training unstable. I find that hard to understand: how exactly does it become unstable? In tabular Q-learning the Q-table is used the same way, yet it is stable.
I am confused.
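
To make the comparison concrete, here is a minimal sketch of the two updates as I understand them. It is my own illustration, not taken from any library: the linear model in `q_values` is only a stand-in for a neural net, and the function names, feature vectors, `ALPHA`, and `GAMMA` are all made-up assumptions.

```python
GAMMA = 0.9   # discount factor (assumed value)
ALPHA = 0.1   # learning rate (assumed value)

def tabular_update(Q, s, a, r, s_next):
    """One tabular Q-learning step; only the entry Q[s][a] is modified."""
    target = r + GAMMA * max(Q[s_next])
    Q[s][a] += ALPHA * (target - Q[s][a])
    return target

def q_values(W, phi):
    """Q-values for every action from a linear 'network': one weight row per action."""
    return [sum(w * x for w, x in zip(row, phi)) for row in W]

def single_net_update(W, phi_s, a, r, phi_next):
    """One semi-gradient step where the same weights W give both prediction and target."""
    target = r + GAMMA * max(q_values(W, phi_next))
    td_error = target - q_values(W, phi_s)[a]
    # The weights for action a are shared across all states, including s_next.
    W[a] = [w + ALPHA * td_error * x for w, x in zip(W[a], phi_s)]
    return target

def demo():
    """Recompute the target for the same transition after one update of each kind."""
    Q = [[0.0, 0.0], [0.0, 0.0]]  # 2 states x 2 actions
    tabular_update(Q, s=0, a=0, r=1.0, s_next=1)
    tabular_target_after = 1.0 + GAMMA * max(Q[1])

    W = [[0.0, 0.0], [0.0, 0.0]]  # 2 actions x 2 features
    phi_s, phi_next = [1.0, 0.5], [0.5, 1.0]  # overlapping features (hypothetical)
    single_net_update(W, phi_s, a=0, r=1.0, phi_next=phi_next)
    net_target_after = 1.0 + GAMMA * max(q_values(W, phi_next))
    return tabular_target_after, net_target_after

print(demo())
```

Both targets start at 1.0 in this toy setup. If I have this right, the table update leaves the target for the same transition where it was, while the shared-weight update moves it, but I still don't see why that has to make training unstable.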