Could anybody please help me design the state-space graph for the Markov decision process in the car-racing example from Berkeley CS188?
Suppose I can take 100 actions, and I want to run value iteration to find the policy that maximizes my total reward.
With only three states (Cool, Warm, and Overheated), I don't know how to add an "End" state and complete the MDP.
I am thinking of having 100 Cool states and 100 Warm states, so that, for example, from Cool1 you can go to Cool2, Warm2, or Overheated, and so on. In this scheme, the values of states with indices close to 0 are higher than those of states with indices close to 100.
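To make my idea concrete, here is a small sketch of finite-horizon value iteration for this MDP. Instead of duplicating the states 100 times, it keeps the three states and sweeps 100 times, which is equivalent to the time-expanded graph. The transition probabilities and rewards are the ones I remember from the CS188 slides, so treat them as assumptions:

```python
# Sketch of finite-horizon value iteration for the CS188-style racing MDP.
# The transition/reward numbers are assumed from memory of the lecture slides.

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "Cool": {
        "Slow": [(1.0, "Cool", 1)],
        "Fast": [(0.5, "Cool", 2), (0.5, "Warm", 2)],
    },
    "Warm": {
        "Slow": [(0.5, "Cool", 1), (0.5, "Warm", 1)],
        "Fast": [(1.0, "Overheated", -10)],
    },
    "Overheated": {},  # terminal: no actions available, value stays 0
}

HORIZON = 100  # number of actions we are allowed to take


def value_iteration(horizon=HORIZON):
    # V[s] = best expected sum of rewards from s with k steps remaining;
    # we sweep k = 1 .. horizon, so this is exactly the time-expanded MDP.
    V = {s: 0.0 for s in transitions}
    policy = []  # policy[k-1][s] = best action from s with k steps remaining
    for _ in range(horizon):
        new_v, best_actions = {}, {}
        for s, actions in transitions.items():
            if not actions:  # terminal state keeps value 0
                new_v[s] = 0.0
                continue
            q = {a: sum(p * (r + V[s2]) for p, s2, r in outcomes)
                 for a, outcomes in actions.items()}
            best = max(q, key=q.get)
            new_v[s], best_actions[s] = q[best], best
        V = new_v
        policy.append(best_actions)
    return V, policy


V, policy = value_iteration()
print(V["Cool"], V["Warm"])
print(policy[-1])  # best first action from each state with 100 steps left
```

This matches the intuition in my question: the value with many steps remaining (the "Cool1" end of the expanded graph) is larger than the value with few steps remaining, because there is more reward left to collect.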
Am I missing something about MDPs?