I have a stochastic optimal control problem that I would like to solve using some kind of Bayesian, simulation-based framework. The problem has the following general structure:
s_{t+1} = r*s_t*(1 - s_t) - x_{t+1} + epsilon_{t+1}
x_{t+1} ~ Beta(u_{t+1}, w_{t+1})
u_{t+1} = f_1(u_t, w_t, s_t, x_t)
w_{t+1} = f_2(u_t, w_t, s_t, x_t)
epsilon_t ~ Normal(0, sigma)
objective function: max_{x_t} E( Sigma_{t=0}^{T} rho^t * V(s_t, x_t, c) )
My goal is to explore different functional forms of f_1, f_2, and V to determine how this model differs from a non-stochastic model and from another, simpler stochastic model.
The state variable is s_t and the control variable is x_t; u_t and w_t are the parameters of the Beta distribution and represent some belief about the current state. The objective is to maximize the expected discounted sum of gains (given by the function V, with discount factor rho) over the time period t=0 to t=T.
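To make the structure concrete, here is a minimal sketch of how I imagine simulating one path forward and estimating the expected discounted objective by plain Monte Carlo with NumPy. Everything specific in it is an assumption for illustration only: the forms of f_1, f_2, and V are placeholders, x_0 is drawn from Beta(u_0, w_0), sigma is treated as a standard deviation, and s_t is clipped to [0, 1] purely to keep the logistic term from diverging (the clipping is not part of the model above). It only evaluates the objective for given f_1 and f_2; it does not optimize anything.

```python
import numpy as np

# Placeholder functional forms (assumptions, not part of the model itself).
def f_1(u, w, s, x):
    return u + x            # hypothetical belief update for the first Beta parameter

def f_2(u, w, s, x):
    return w + (1.0 - x)    # hypothetical belief update for the second Beta parameter

def V(s, x, c):
    return x - c * x**2     # hypothetical gain: linear revenue minus quadratic cost

def simulate_path(T, r, sigma, rho, c, s0, u0, w0, rng):
    """Simulate one path and return its discounted sum of gains for t = 0..T."""
    s, u, w = s0, u0, w0
    x = rng.beta(u, w)                  # assumption: x_0 ~ Beta(u_0, w_0)
    total = V(s, x, c)                  # t = 0 term (rho**0 == 1)
    for t in range(T):
        # Both updates use the time-t values of u, w, s, x.
        u, w = f_1(u, w, s, x), f_2(u, w, s, x)
        x = rng.beta(u, w)              # x_{t+1} ~ Beta(u_{t+1}, w_{t+1})
        eps = rng.normal(0.0, sigma)    # sigma taken as a standard deviation
        s = r * s * (1.0 - s) - x + eps
        s = min(max(s, 0.0), 1.0)       # assumption: clip to [0, 1] for stability
        total += rho ** (t + 1) * V(s, x, c)
    return total

def estimate_objective(n_paths, T, r, sigma, rho, c, s0, u0, w0, seed=0):
    """Monte Carlo estimate of E[sum_t rho^t V(s_t, x_t, c)]."""
    rng = np.random.default_rng(seed)
    totals = [simulate_path(T, r, sigma, rho, c, s0, u0, w0, rng)
              for _ in range(n_paths)]
    return float(np.mean(totals))

if __name__ == "__main__":
    print(estimate_objective(n_paths=1000, T=20, r=3.0, sigma=0.05,
                             rho=0.95, c=0.5, s0=0.5, u0=2.0, w0=5.0))
```

Comparing different choices of f_1, f_2, and V would then amount to swapping the placeholder functions and re-running the estimate with the same seed.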
I was thinking of using Python, specifically PyMC, to solve this, but I am not sure how to proceed, in particular how to optimize the control variables. I found a book published in 1967, Optimization of Stochastic Systems by Masanao Aoki, that describes some Bayesian techniques that may be useful. Is there a current Python implementation that could help? Or is there a much better way to simulate an optimal path using Python?