For the MLP policy for controlling the autonomous control, if we only use the current observations (speed of the autonomous vehicle, speed of the preceding vehicle and the relative distance) or we have to use some other processing method before feeding the current observations to MLP policy. As the problem is partially observed and I am not sure if I can only use the current observations.
Asked
Active
Viewed 43 times
1 Answers
0
could you clarify the question a bit? Which scenario are you referring to? Technically you can use whatever observations you like for the MLP.

Eugene Vinitsky
- 56
- 1
- 3
-
In FLOW, I want to train an agent to control an autonomous vehicle. The observations are speeds of the autonomous vehicle, speed of the preceding vehicle and the relative distance. This is a partially observed MDP problem. I am a little bit confused. I am not sure if I can only use the current observations. – Yue Wang Jul 27 '19 at 18:02