In Q-learning, how should I represent my Reward function if my Q-function is approximated by a normal Feed-Forward Neural Network?
Should I represent it as discrete values such as "near", "very near" to the goal, etc.? What I'm really concerned about is this: now that I have moved to a neural-network approximation of the Q-function, Q(s, a, θ), and am no longer using a lookup table, am I still obliged to build a reward table as well?
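To make the question concrete, this is roughly the alternative I have in mind: computing the reward on the fly from the state instead of storing it in a table. The distance-based shaping and the names `state`/`goal` are just placeholders for my setup, not something the algorithm prescribes.

```python
import numpy as np

def reward(state, goal):
    """Example of a reward computed directly from the state:
    negative distance to the goal, plus a bonus on reaching it."""
    dist = np.linalg.norm(np.asarray(state, dtype=float) - np.asarray(goal, dtype=float))
    return 10.0 if dist < 1e-6 else -dist
```

Is something like this fine to feed into the Q-learning target when Q(s, a, θ) is a neural network, or do I still need discretized reward values?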