I've got an MDP problem with the following environment (3x4 map):
The possible actions are Up/Down/Right/Left, with a 0.8 chance of moving in the intended direction and a 0.1 chance for each perpendicular direction (e.g. for Up: 0.1 chance to go Left, 0.1 chance to go Right).
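To make the noise model concrete, here is a minimal sketch of how I read it. The names `SLIPS` and `outcome_probs` are made up for illustration, and the table assumes that Down and Left/Right slip to their perpendicular directions in the same way as the Up example.

```python
# Perpendicular "slip" directions for each intended action (assumed symmetric
# to the Up example given in the description).
SLIPS = {
    "Up": ("Left", "Right"),
    "Down": ("Left", "Right"),
    "Left": ("Up", "Down"),
    "Right": ("Up", "Down"),
}


def outcome_probs(action):
    """Return {actual_direction: probability} for one intended action."""
    first, second = SLIPS[action]
    return {action: 0.8, first: 0.1, second: 0.1}
```

So `outcome_probs("Up")` gives 0.8 for Up and 0.1 each for Left and Right.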
Now I need to calculate the possible outcomes when starting in (1,1) and running the following sequence of actions:
[Up, Up, Right, Right, Right]
I also need to calculate the probability of reaching each field (for each possible outcome) with this action sequence. How can I do this efficiently, i.e. without going through all of the possible results (at least 2^5, at most 3^5)?
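One approach that might avoid enumerating paths is to keep a probability distribution over cells and push it forward one action at a time, merging the probability of paths that end in the same cell. The sketch below is only an illustration under several assumptions that are not confirmed by the map: cells are (column, row) with (1,1) in the bottom-left corner, the grid is 4 columns by 3 rows, there are no obstacles or terminal states (`BLOCKED` is empty), and a move that would leave the grid keeps the agent where it is. All names (`move`, `step`, `run`, `BLOCKED`) are hypothetical.

```python
from collections import defaultdict

COLS, ROWS = 4, 3  # assumed layout: 4 columns, 3 rows, (1, 1) bottom-left
MOVES = {"Up": (0, 1), "Down": (0, -1), "Left": (-1, 0), "Right": (1, 0)}
SLIPS = {"Up": ("Left", "Right"), "Down": ("Left", "Right"),
         "Left": ("Up", "Down"), "Right": ("Up", "Down")}
BLOCKED = set()  # assumed empty; wall cells from the real map would go here


def move(cell, direction):
    """Deterministic move; stay in place if the target is off-grid or blocked."""
    dx, dy = MOVES[direction]
    nxt = (cell[0] + dx, cell[1] + dy)
    inside = 1 <= nxt[0] <= COLS and 1 <= nxt[1] <= ROWS
    return nxt if inside and nxt not in BLOCKED else cell


def step(belief, action):
    """Push a {cell: probability} distribution through one noisy action."""
    new_belief = defaultdict(float)
    outcomes = [(action, 0.8)] + [(d, 0.1) for d in SLIPS[action]]
    for cell, p in belief.items():
        for direction, q in outcomes:
            new_belief[move(cell, direction)] += p * q
    return dict(new_belief)


def run(start, actions):
    """Distribution over cells after executing the whole action sequence."""
    belief = {start: 1.0}
    for action in actions:
        belief = step(belief, action)
    return belief


result = run((1, 1), ["Up", "Up", "Right", "Right", "Right"])
for cell, p in sorted(result.items()):
    print(cell, round(p, 5))
```

Each step touches at most 12 cells times 3 outcomes, so the work grows linearly with the length of the sequence rather than exponentially. Terminal states, if the map has them, would need extra handling (e.g. keeping their probability mass fixed once reached).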
Thanks in advance!