Questions tagged [markov-decision-process]
51 questions
0
votes
1 answer
Influence diagrams / Decision models in Stan and PyMC3
Is it possible to write decision-making models in either Stan or PyMC3? By that I mean: we define not only the distributions of random variables, but also decision and utility variables, and determine the decisions that maximize…
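Neither Stan nor PyMC3 has a built-in notion of decision or utility nodes, so a common workaround is to sample the posterior as usual and then maximize expected utility outside the model. A minimal sketch of that pattern, with a hypothetical demand posterior and utility function standing in for real model output:

```python
import numpy as np

# Hypothetical sketch: choose the decision maximizing Monte Carlo expected utility.
# The "posterior" here is a stand-in for draws from a fitted Stan/PyMC3 model.
rng = np.random.default_rng(0)
demand_samples = rng.normal(loc=100.0, scale=15.0, size=5000)

def utility(decision, demand):
    """Toy utility (assumption): revenue of 10 per unit sold, holding cost
    of 2 per unsold unit when we stock `decision` units against `demand`."""
    sold = np.minimum(decision, demand)
    return 10.0 * sold - 2.0 * np.maximum(decision - demand, 0.0)

decisions = np.arange(50, 151)  # candidate decisions (order quantities)
expected_utility = np.array([utility(d, demand_samples).mean() for d in decisions])
best = decisions[expected_utility.argmax()]
```

The decision variable never enters the probabilistic program itself; it is optimized over the posterior samples after inference, which is why this works with either framework.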

user118967
0
votes
1 answer
Confusion in understanding Q(s,a) formula for Reinforcement Learning MDP?
I was trying to understand the proof of why the policy improvement theorem can be applied to an epsilon-greedy policy.
The proof starts with the mathematical definition -
I am confused by the very first line of the proof.
This equation is the Bellman…
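The first line of such proofs is usually the Bellman expectation equation for the action-value function of a policy \(\pi\), which may be the equation the excerpt above refers to:

```latex
Q^{\pi}(s, a)
  = \sum_{s'} p(s' \mid s, a)
    \left[ r(s, a, s')
         + \gamma \sum_{a'} \pi(a' \mid s') \, Q^{\pi}(s', a') \right]
```

It says the value of taking action \(a\) in state \(s\) and following \(\pi\) thereafter equals the expected immediate reward plus the discounted value of the next state-action pair under \(\pi\).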

Dhruv Chadha
0
votes
2 answers
Reinforcement Learning with MDP for revenues optimization
I want to model the service of selling airplane seats as an MDP (Markov decision process) and use reinforcement learning for airline revenue optimization. For that I need to define the states, actions, policy, value, and reward.…
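One hypothetical way to pin those pieces down: state = (days to departure, seats left), action = the fare offered, reward = revenue collected, with an assumed demand curve driving the transition. A toy sketch (all numbers and the demand model are assumptions, not a real pricing model):

```python
import random

# Toy MDP formulation of the airline-seat problem (all parameters hypothetical).
# State: (days_left, seats_left); action: ticket price; reward: revenue.
PRICES = [100, 200, 300]  # hypothetical action space of fares

def sale_probability(price):
    return max(0.0, 1.0 - price / 400.0)  # assumed demand curve

def step(state, price, rng):
    """One transition: a seat sells with probability given by the demand curve."""
    days_left, seats_left = state
    sold = seats_left > 0 and rng.random() < sale_probability(price)
    reward = price if sold else 0
    return (days_left - 1, seats_left - (1 if sold else 0)), reward

rng = random.Random(0)
state, total = (30, 10), 0  # 30 days out, 10 seats remaining
while state[0] > 0:
    state, r = step(state, rng.choice(PRICES), rng)  # random policy, for illustration
    total += r
```

A value function over (days_left, seats_left) and a policy mapping that state to a price would then complete the formulation; any RL algorithm can be run against `step`.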
0
votes
1 answer
Determine an MDP from observed transitions
The following transitions have been observed in a Markov decision process; try to determine it.
 R   A   S′   S
 0   U   C    B
-1   L   E    C
 0   D   C    A
-1   R   E    C
 0   D   C    A
+1   R   D    C
 0   U   C    B
+1   R   D    C
I need to find the states, transitions, rewards…
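One way to approach this kind of exercise is the maximum-likelihood estimate: tally, for each (state, action) pair, how often each next state was observed. A sketch using the rows listed above (columns R, A, S′, S):

```python
from collections import Counter, defaultdict

# Transitions from the question, as (reward, action, next_state, state) tuples.
observed = [
    (0, "U", "C", "B"), (-1, "L", "E", "C"), (0, "D", "C", "A"),
    (-1, "R", "E", "C"), (0, "D", "C", "A"), (+1, "R", "D", "C"),
    (0, "U", "C", "B"), (+1, "R", "D", "C"),
]

counts = defaultdict(Counter)
rewards = {}
for r, a, s_next, s in observed:
    counts[(s, a)][s_next] += 1          # tally next states per (state, action)
    rewards[(s, a, s_next)] = r          # rewards look deterministic per transition

# Maximum-likelihood transition probabilities P(s' | s, a):
P = {sa: {s2: n / sum(c.values()) for s2, n in c.items()}
     for sa, c in counts.items()}
```

On this data, the only stochastic pair is (C, R): it reached D twice (reward +1) and E once (reward -1), giving P(D | C, R) = 2/3 and P(E | C, R) = 1/3; every other observed (state, action) pair always led to the same next state.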

Alaleh
0
votes
1 answer
Following action a from state s, is the outcome probabilistic or deterministic?
I am struggling to understand one aspect of the Markov decision process.
When I am in state s and take action a, is arriving in the next state s′ deterministic or stochastic?
In most examples it seems to be deterministic. However, I found one example in the…
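In general the transition is stochastic, given by a distribution P(s′ | s, a); the deterministic case is just the special case where a single next state has probability 1. A toy sketch of both (the states, actions, and probabilities are made up for illustration):

```python
import random

# Transition model P(s' | s, a) as a dict of distributions (hypothetical MDP).
P = {
    ("s0", "a"): {"s1": 1.0},             # deterministic: action a always reaches s1
    ("s0", "b"): {"s1": 0.7, "s2": 0.3},  # stochastic: action b has two outcomes
}

def sample_next_state(state, action, rng):
    """Draw the next state from P(s' | state, action)."""
    states, probs = zip(*P[(state, action)].items())
    return rng.choices(states, weights=probs, k=1)[0]

rng = random.Random(0)
outcomes = {sample_next_state("s0", "b", rng) for _ in range(1000)}
```

Repeating action b from s0 yields both s1 and s2, while action a always lands in s1; both cases fit the same MDP formalism.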

siva
-1
votes
0 answers
Difference between Expectimax algorithm and Value Iteration Algorithm?
I have been learning about search problems, and have been struggling with the difference between the two.
Search problems involving uncertainty can be modeled using MDPs, and it doesn't matter if we get a finite search tree or an infinite search…
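The core contrast: value iteration applies the Bellman optimality backup to every state repeatedly until the values converge to a fixed point, while expectimax applies the same backup as a depth-limited recursion from a single root state. A sketch on a tiny hypothetical MDP (memoization added to expectimax only to keep the demo fast):

```python
from functools import lru_cache

# Tiny hypothetical MDP: P[(s, a)] -> list of (prob, next_state, reward).
STATES = ["A", "B"]
ACTIONS = ["stay", "go"]
GAMMA = 0.9
P = {
    ("A", "stay"): [(1.0, "A", 0.0)],
    ("A", "go"):   [(0.8, "B", 1.0), (0.2, "A", 0.0)],
    ("B", "stay"): [(1.0, "B", 2.0)],
    ("B", "go"):   [(1.0, "A", 0.0)],
}

def backup(s, a, V):
    """Expected value of taking a in s, then continuing with values V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[(s, a)])

# Value iteration: sweep all states until the Bellman backup converges.
V = {s: 0.0 for s in STATES}
for _ in range(1000):
    V = {s: max(backup(s, a, V) for a in ACTIONS) for s in STATES}

# Expectimax: the same backup as a depth-limited recursion from one state.
@lru_cache(maxsize=None)
def expectimax(s, depth):
    if depth == 0:
        return 0.0
    return max(
        sum(p * (r + GAMMA * expectimax(s2, depth - 1)) for p, s2, r in P[(s, a)])
        for a in ACTIONS
    )
```

As the depth grows, the expectimax value of a state approaches the value-iteration fixed point; the practical difference is that value iteration amortizes the backup over all states at once, while expectimax answers for one root state and a finite horizon.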