I want to model the sale of seats on an airplane as an MDP (Markov decision process) so that I can use reinforcement learning to optimize airline revenue. For that I need to define the states, actions, policy, value and reward. I have thought about it a little, but I think something is still missing.
I model my system this way:
States = (r, c)
where r is the number of passengers and c is the number of seats bought, so r >= c.
Actions = (p1, p2, p3)
which are the 3 prices. The objective is to decide which one of them gives more revenue.
Reward: revenue.
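To make the formulation concrete, here is a rough sketch of how I imagine it in Python, using tabular Q-learning as one possible way to learn which price to offer in each state. Everything numeric is an assumption I made up for illustration: the values of p1, p2, p3 in PRICES, the purchase probabilities in BUY_PROB, the plane size CAPACITY and the number of requests MAX_REQUESTS. I also assume passengers arrive one at a time and each one either buys a single seat at the offered price or leaves.

```python
import random

PRICES = (100.0, 150.0, 200.0)  # hypothetical values for p1, p2, p3
BUY_PROB = (0.8, 0.5, 0.2)      # assumed chance a passenger buys at each price
CAPACITY = 5                    # assumed number of seats on the plane
MAX_REQUESTS = 10               # assumed number of passengers who will ask


def step(state, action, rng):
    """Offer price `action` to the next passenger.

    Returns (next_state, reward, done), where state = (r, c).
    """
    r, c = state
    bought = rng.random() < BUY_PROB[action]
    reward = PRICES[action] if bought else 0.0  # reward = revenue of this sale
    next_state = (r + 1, c + 1 if bought else c)
    # The episode ends when the plane is full or nobody else will ask.
    done = next_state[1] == CAPACITY or next_state[0] == MAX_REQUESTS
    return next_state, reward, done


def q_learning(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy over the 3 prices."""
    rng = random.Random(seed)
    actions = range(len(PRICES))
    q = {}  # maps (state, action) -> estimated remaining revenue
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(PRICES))  # explore
            else:
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = step(state, action, rng)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in actions
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q


if __name__ == "__main__":
    q = q_learning()
    start = (0, 0)
    best = max(range(len(PRICES)), key=lambda a: q.get((start, a), 0.0))
    print("Learned price for the first passenger:", PRICES[best])
```

In this sketch the policy is just "pick the price with the highest Q-value in the current state", and the value of a state is the best Q-value there. I am not sure whether (r, c) alone is enough as a state, for example whether the time left before departure should be part of it.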
Could you please tell me what you think and help me?
After the modeling, I have to implement all of this with reinforcement learning. Is there a package that does this?