
I am lost in the documentation of pymcmcstat, the Python MCMC package. I managed to plot the parameter distributions, but for the Bayes factor I need to calculate the integral of the likelihood over the parameter space for each model.

I followed this video: https://www.youtube.com/watch?v=rmWlHpGfSio. Each model has a different model function with different parameters. According to this link: http://alumni.media.mit.edu/~tpminka/statlearn/demo/, I am supposed to compare the model evidences for model selection. All I have in hand are the post-burnin chain results: the distribution for each parameter, the chain of sum-of-squares errors (SSE), and the chain of error variances. How do I compare the models with the MCMC chain results I have?

Where do I go from here?
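My understanding is that the evidence for a model M is p(D|M) = ∫ p(D|θ,M) p(θ|M) dθ. The only estimator I have found that uses nothing but posterior samples is the harmonic mean of the likelihoods. I know it is reported to be high-variance, so this is only a sketch of what I mean, not something I trust:

```python
import numpy as np

def log_evidence_harmonic_mean(loglik_samples):
    """Harmonic-mean estimate of the log model evidence from posterior
    log-likelihood samples. Known to be unstable/high-variance."""
    loglik = np.asarray(loglik_samples, dtype=float)
    n = len(loglik)
    # log of mean(1/L_i), computed stably in log space
    m = (-loglik).max()
    log_mean_inv_lik = m + np.log(np.exp(-loglik - m).sum() / n)
    # the evidence estimate is the reciprocal of that mean
    return -log_mean_inv_lik
```

Is something along these lines the right direction, or is there a better-behaved estimator I can compute from the pymcmcstat chains?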

Here is my code for one model; for each model, test_modelfun is changed and the chain results are saved so the different models can be compared later:

# Data-related lines: input frequencies omega (x) and measured output fm (y)
# imports (missing from the original listing)
import math, cmath
import numpy as np
from math import pi, sin, cos
from pymcmcstat.MCMC import MCMC

x = np.array([76.29395, 152.5879, 305.1758, 610.3516, 1220.703, 2441.406,
              4882.813, 9765.625, 19531.25, 39062.5, 78125, 156250, 312500, 625000])
y = np.array([155.6412886 - 63.3826188j, 113.9114436 - 79.90544719j,
              64.97809441 - 77.65152741j, 26.87482243 - 57.38474656j,
              7.44462341 - 34.02438426j, 2.32954856 - 16.17918216j,
              2.30747953 - 6.72487436j, 3.39658859 - 2.72444011j,
              4.0084345 - 1.2029167j, 4.25877486 - 0.70276446j,
              4.11761329 - 0.69591231j, 3.83339489 - 0.65244854j,
              3.47289164 - 0.6079278j, 3.07027319 - 0.14914359j])
## define the transfer-function model evaluated with the theta parameters
def test_modelfun(xdata, theta):
    omega = np.asarray(xdata).ravel()  # frequencies arrive via xdata, not a global
    # parameters are sampled in log10 space
    K, alpha_0, alpha_1, Tp_1, Tp_2, Tz_1 = (
        10**theta[0], 10**theta[1], 10**theta[2],
        10**theta[3], 10**theta[4], 10**theta[5])
    #####################
    Pz_0 = (omega**(alpha_0))
    Pz_1 = (np.sqrt(((Tp_1**2)*(omega**(2*alpha_1))) + (2*Tp_1*(omega**alpha_1)*cos(alpha_1*pi/2)) +1))
    Pz_2 = (np.sqrt(((Tp_2**2)*(omega**(2*alpha_1))) + (2*Tp_2*(omega**alpha_1)*cos(alpha_1*pi/2)) +1))
    Zz_1 = (np.sqrt(((Tz_1**2)*(omega**(2*alpha_1))) + (2*Tz_1*(omega**alpha_1)*cos(alpha_1*pi/2)) +1))
    Pp_0 = np.full(len(omega), -pi*alpha_0/2)
    Pp_1 = np.array([math.atan((Tp_1*(omega[i]**alpha_1)*sin(pi*alpha_1/2))/(1+(Tp_1*(omega[i]**alpha_1)*cos(pi*alpha_1/2)))) for i in range(len(omega))])
    Pp_2 = np.array([math.atan((Tp_2*(omega[i]**alpha_1)*sin(pi*alpha_1/2))/(1+(Tp_2*(omega[i]**alpha_1)*cos(pi*alpha_1/2)))) for i in range(len(omega))])
    Zp_1 = np.array([math.atan((Tz_1*(omega[i]**alpha_1)*sin(pi*alpha_1/2))/(1+(Tz_1*(omega[i]**alpha_1)*cos(pi*alpha_1/2)))) for i in range(len(omega))])
    #####################
    Z_est = (K*Zz_1)/(Pz_0*Pz_1*Pz_2)
    P_est = Zp_1 + Pp_0 - Pp_1 - Pp_2
    #####################
    rect   = [cmath.rect(Z_est[i], P_est[i]) for i in range(len(omega))]
    R_est  = np.real(rect)
    X_est  = np.imag(rect)
    RX_est = (R_est + 1j*X_est)
    return RX_est
def modelfun(xdata, theta):
    # magnitude response in dB, used for the prediction-interval plots
    ymodel = test_modelfun(xdata, theta)
    return 20*np.log10(np.abs(ymodel))

## define sum-of-squares function for the error used in the likelihood L(Fobs(i)|q)
def test_ssfun(theta, data):
    xdata = data.xdata[0]
    ydata = data.ydata[0].ravel()
    ymodel = test_modelfun(xdata, theta)
    # mean of squared residuals over real and imaginary parts
    return (1/len(ydata))*np.sum((np.real(ydata) - np.real(ymodel))**2
                                 + (np.imag(ydata) - np.imag(ymodel))**2)
## create the mcmc object and add the data set
itr = 50.0e4
verb = 1
wbar = 1
mcstat = MCMC()
mcstat.data.add_data_set(x,y)
## add model parameters
mcstat.parameters.add_model_parameter(name='th_1',theta0=1, minimum=-2,maximum=3) #m_k, M_k = -2, 3
mcstat.parameters.add_model_parameter(name='th_2',theta0=-1, minimum=-4,maximum=0) #m_a0, M_a0 = -4, 0
mcstat.parameters.add_model_parameter(name='th_3',theta0=-1, minimum=-3,maximum=0) #m_a1, M_a1 = -3, 0
mcstat.parameters.add_model_parameter(name='th_4',theta0=-4, minimum=-9,maximum=0) #m_p1, M_p1 = -9, 0
mcstat.parameters.add_model_parameter(name='th_5',theta0=-4, minimum=-9,maximum=0) #m_p2, M_p2 = -9, 0
mcstat.parameters.add_model_parameter(name='th_6',theta0=-4, minimum=-9,maximum=0) #m_z1, M_z1 = -9, 0
## define simulation options: mh=metropolis-hastings, am=adaptive  metropolis, dr=delayed rejection, dram=dr+am
mcstat.simulation_options.define_simulation_options(nsimu=int(itr), updatesigma=1, method='dr', adaptint=100, verbosity=verb, waitbar=wbar)
## define model settings
mcstat.model_settings.define_model_settings(sos_function=test_ssfun)
mcstat.run_simulation()
## extract results
results=mcstat.simulation_results.results
chain = results['chain']      # chain for each parameter sampled during the simulation
s2chain = results['s2chain']  # chain of error variances; empty list if updatesigma=0
sschain = results['sschain']  # chain of sum-of-squares errors for each sampled parameter set
names = results['names']
burnin = int(itr/2)
## display chain statistics
mcstat.chainstats(chain[burnin:,:],results)
mcpl = mcstat.mcmcplot
figcp = mcpl.plot_chain_panel(chain, names, figsizeinches = (7,6))
axes = figcp.get_axes()
for ii, ax in enumerate(axes):
    ch = chain[:, ii]
    ax.plot([burnin, burnin], [ch.min(), ch.max()], 'r')
figpd = mcpl.plot_density_panel(chain[burnin:,:], names, figsizeinches=(7,6))
figpc = mcpl.plot_pairwise_correlation_panel(chain[burnin:,:], names, figsizeinches = (7,6))
mcstat.PI.setup_prediction_interval_calculation(results=results, data=mcstat.data, modelfunction=modelfun, burnin=burnin)
mcstat.PI.generate_prediction_intervals(calc_pred_int=True, waitbar=False)
fg, ax = mcstat.PI.plot_prediction_intervals(adddata=True, plot_pred_int=True, figsizeinches=(7,5), data_display=dict(color='k'))
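To connect the saved chains to an actual likelihood, my plan was to turn the post-burnin sschain and s2chain into per-sample Gaussian log-likelihoods. This assumes the standard Gaussian error model; the factor of n and the scaling of ss may need adjusting, since my test_ssfun divides the SSE by the number of points:

```python
import numpy as np

def chain_loglik(sschain, s2chain, n):
    """Per-sample Gaussian log-likelihoods from pymcmcstat's sschain/s2chain,
    assuming log L_i = -n/2 * log(2*pi*s2_i) - ss_i / (2*s2_i), with n the
    number of data points. If the ssfun rescales the SSE (mine divides by
    len(omega)), multiply sschain back up by n before calling this."""
    ss = np.asarray(sschain, dtype=float).ravel()
    s2 = np.asarray(s2chain, dtype=float).ravel()
    return -0.5 * n * np.log(2 * np.pi * s2) - ss / (2 * s2)
```

Is this the right way to recover the likelihood values needed for the evidence integral, or does pymcmcstat store them somewhere I have missed?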