Questions tagged [markov-decision-process]
51 questions
0
votes
1 answer
Influence diagrams / Decision models in Stan and PyMC3
Is it possible to write decision-making models in either Stan or PyMC3? By that I mean: we define not only the distributions of random variables, but also decision and utility variables, and determine the decisions that maximize…
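Neither Stan nor PyMC3 has a built-in notion of decision or utility nodes, so a common workaround is to sample the posterior as usual and then maximize expected utility outside the model. A minimal sketch of that pattern, with a hypothetical demand posterior and utility function standing in for real model output:

```python
import numpy as np

# Hypothetical sketch: choose the decision maximizing Monte Carlo expected utility.
# The "posterior" here is a stand-in for draws from a fitted Stan/PyMC3 model.
rng = np.random.default_rng(0)
demand_samples = rng.normal(loc=100.0, scale=15.0, size=5000)

def utility(decision, demand):
    """Toy utility (assumption): revenue of 10 per unit sold, holding cost
    of 2 per unsold unit when we stock `decision` units against `demand`."""
    sold = np.minimum(decision, demand)
    return 10.0 * sold - 2.0 * np.maximum(decision - demand, 0.0)

decisions = np.arange(50, 151)  # candidate decisions (order quantities)
expected_utility = np.array([utility(d, demand_samples).mean() for d in decisions])
best = decisions[expected_utility.argmax()]
```

The decision variable never enters the probabilistic program itself; it is optimized over the posterior samples after inference, which is why this works with either framework.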

user118967
0
votes
1 answer
Confusion in understanding Q(s,a) formula for Reinforcement Learning MDP?
I was trying to understand the proof of why the policy improvement theorem can be applied to an epsilon-greedy policy.
The proof starts with the mathematical definition -
I am confused by the very first line of the proof.
This equation is the Bellman…
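The first line of such proofs is usually the Bellman expectation equation for the action-value function of a policy \(\pi\), which may be the equation the excerpt above refers to:

```latex
Q^{\pi}(s, a)
  = \sum_{s'} p(s' \mid s, a)
    \left[ r(s, a, s')
         + \gamma \sum_{a'} \pi(a' \mid s') \, Q^{\pi}(s', a') \right]
```

It says the value of taking action \(a\) in state \(s\) and following \(\pi\) thereafter equals the expected immediate reward plus the discounted value of the next state-action pair under \(\pi\).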

Dhruv Chadha
0
votes
2 answers
Reinforcement Learning with MDP for revenues optimization
I want to model the service of selling airplane seats as an MDP (Markov decision process) and use reinforcement learning for airline revenue optimization. For that I need to define the states, actions, policy, value, and reward.…
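One hypothetical way to pin those pieces down: state = (days to departure, seats left), action = the fare offered, reward = revenue collected, with an assumed demand curve driving the transition. A toy sketch (all numbers and the demand model are assumptions, not a real pricing model):

```python
import random

# Toy MDP formulation of the airline-seat problem (all parameters hypothetical).
# State: (days_left, seats_left); action: ticket price; reward: revenue.
PRICES = [100, 200, 300]  # hypothetical action space of fares

def sale_probability(price):
    return max(0.0, 1.0 - price / 400.0)  # assumed demand curve

def step(state, price, rng):
    """One transition: a seat sells with probability given by the demand curve."""
    days_left, seats_left = state
    sold = seats_left > 0 and rng.random() < sale_probability(price)
    reward = price if sold else 0
    return (days_left - 1, seats_left - (1 if sold else 0)), reward

rng = random.Random(0)
state, total = (30, 10), 0  # 30 days out, 10 seats remaining
while state[0] > 0:
    state, r = step(state, rng.choice(PRICES), rng)  # random policy, for illustration
    total += r
```

A value function over (days_left, seats_left) and a policy mapping that state to a price would then complete the formulation; any RL algorithm can be run against `step`.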
0
votes
1 answer
Determine an MDP from observed transitions
The following transitions have been observed in a Markov decision process; try to determine it.
 R   A   S′   S
 0   U   C    B
-1   L   E    C
 0   D   C    A
-1   R   E    C
 0   D   C    A
+1   R   D    C
 0   U   C    B
+1   R   D    C
I need to find the states, transitions, rewards…
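One way to approach this kind of exercise is the maximum-likelihood estimate: tally, for each (state, action) pair, how often each next state was observed. A sketch using the rows listed above (columns R, A, S′, S):

```python
from collections import Counter, defaultdict

# Transitions from the question, as (reward, action, next_state, state) tuples.
observed = [
    (0, "U", "C", "B"), (-1, "L", "E", "C"), (0, "D", "C", "A"),
    (-1, "R", "E", "C"), (0, "D", "C", "A"), (+1, "R", "D", "C"),
    (0, "U", "C", "B"), (+1, "R", "D", "C"),
]

counts = defaultdict(Counter)
rewards = {}
for r, a, s_next, s in observed:
    counts[(s, a)][s_next] += 1          # tally next states per (state, action)
    rewards[(s, a, s_next)] = r          # rewards look deterministic per transition

# Maximum-likelihood transition probabilities P(s' | s, a):
P = {sa: {s2: n / sum(c.values()) for s2, n in c.items()}
     for sa, c in counts.items()}
```

On this data, the only stochastic pair is (C, R): it reached D twice (reward +1) and E once (reward -1), giving P(D | C, R) = 2/3 and P(E | C, R) = 1/3; every other observed (state, action) pair always led to the same next state.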

Alaleh
0
votes
1 answer
Following action a from state s, is the outcome probabilistic or deterministic?
I am struggling to understand one aspect of the Markov decision process.
When I am in state s and take action a, is arriving in the next state s′ deterministic or stochastic?
In most examples it seems to be deterministic. However, I found one example in the…
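In general the transition is stochastic, given by a distribution P(s′ | s, a); the deterministic case is just the special case where a single next state has probability 1. A toy sketch of both (the states, actions, and probabilities are made up for illustration):

```python
import random

# Transition model P(s' | s, a) as a dict of distributions (hypothetical MDP).
P = {
    ("s0", "a"): {"s1": 1.0},             # deterministic: action a always reaches s1
    ("s0", "b"): {"s1": 0.7, "s2": 0.3},  # stochastic: action b has two outcomes
}

def sample_next_state(state, action, rng):
    """Draw the next state from P(s' | state, action)."""
    states, probs = zip(*P[(state, action)].items())
    return rng.choices(states, weights=probs, k=1)[0]

rng = random.Random(0)
outcomes = {sample_next_state("s0", "b", rng) for _ in range(1000)}
```

Repeating action b from s0 yields both s1 and s2, while action a always lands in s1; both cases fit the same MDP formalism.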

siva
-1
votes
0 answers
Difference between Expectimax algorithm and Value Iteration Algorithm?
I have been learning about search problems, and have been struggling with the difference between the two.
Search problems involving uncertainty can be modeled using MDPs, and it doesn't matter if we get a finite search tree or an infinite search…
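The core contrast: value iteration applies the Bellman optimality backup to every state repeatedly until the values converge to a fixed point, while expectimax applies the same backup as a depth-limited recursion from a single root state. A sketch on a tiny hypothetical MDP (memoization added to expectimax only to keep the demo fast):

```python
from functools import lru_cache

# Tiny hypothetical MDP: P[(s, a)] -> list of (prob, next_state, reward).
STATES = ["A", "B"]
ACTIONS = ["stay", "go"]
GAMMA = 0.9
P = {
    ("A", "stay"): [(1.0, "A", 0.0)],
    ("A", "go"):   [(0.8, "B", 1.0), (0.2, "A", 0.0)],
    ("B", "stay"): [(1.0, "B", 2.0)],
    ("B", "go"):   [(1.0, "A", 0.0)],
}

def backup(s, a, V):
    """Expected value of taking a in s, then continuing with values V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[(s, a)])

# Value iteration: sweep all states until the Bellman backup converges.
V = {s: 0.0 for s in STATES}
for _ in range(1000):
    V = {s: max(backup(s, a, V) for a in ACTIONS) for s in STATES}

# Expectimax: the same backup as a depth-limited recursion from one state.
@lru_cache(maxsize=None)
def expectimax(s, depth):
    if depth == 0:
        return 0.0
    return max(
        sum(p * (r + GAMMA * expectimax(s2, depth - 1)) for p, s2, r in P[(s, a)])
        for a in ACTIONS
    )
```

As the depth grows, the expectimax value of a state approaches the value-iteration fixed point; the practical difference is that value iteration amortizes the backup over all states at once, while expectimax answers for one root state and a finite horizon.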