
I'm trying to understand the code in the second part of this article (Q-learning with a neural network): https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0

1) Why do we train the network at all? Wouldn't it be easier to write targetQ[0, a[0]] directly into the matrix of weights? 2) Why, after training the network, is W[s, a[0]] != targetQ[0, a[0]], and consequently loss != 0? A sketch of the update step I'm asking about follows.
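To make the question concrete, here is a minimal NumPy re-implementation of the kind of single-layer Q-network update the tutorial performs (one-hot state vector times a weight matrix W, one gradient-descent step toward targetQ). This is my own sketch, not the tutorial's TensorFlow code; the transition values s, a, r, s1 below are hypothetical.

```python
import numpy as np

n_states, n_actions = 16, 4     # FrozenLake-sized, as in the tutorial
lr, gamma = 0.1, 0.99

# Single-layer "network": Q(s, :) = one_hot(s) @ W, so W plays the role of a Q-table.
rng = np.random.default_rng(0)
W = rng.uniform(0, 0.01, size=(n_states, n_actions))

def one_hot(s):
    v = np.zeros((1, n_states))
    v[0, s] = 1.0
    return v

# One update step (hypothetical transition s, a, r, s1):
s, a, r, s1 = 0, 2, 0.0, 4
allQ = one_hot(s) @ W                                  # current Q-values for state s
targetQ = allQ.copy()
targetQ[0, a] = r + gamma * np.max(one_hot(s1) @ W)    # Bellman target for the taken action

# Gradient of loss = sum((targetQ - one_hot(s) @ W)**2) w.r.t. W,
# followed by a single gradient-descent step with learning rate lr.
grad = -2.0 * one_hot(s).T @ (targetQ - allQ)
W -= lr * grad

# After one step, W[s, a] has only moved part of the way toward targetQ[0, a],
# so the squared-error loss is generally still non-zero.
print(W[s, a], targetQ[0, a])
```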

  • That comment should be put in the question instead of being a comment. Please edit your question and delete the comment. Also perhaps see [the help page on commenting](http://stackoverflow.com/help/privileges/comment). – PJvG May 04 '17 at 14:19
  • Have you tried to ask these questions to the author of that tutorial? – PJvG May 04 '17 at 14:20

0 Answers