I am trying to make a discrepancy plot to test goodness of fit after obtaining best-fit values by MCMC with pymc. My code is:

import pymc
import numpy as np
import matplotlib.pyplot as plt, seaborn as sns

# Seeding 
np.random.seed(55555)

# x-data
x = np.linspace(1., 50., 50)

# Gaussian function
def gaus(x, A, x0, sigma): 
        return A*np.exp(-(x-x0)**2/(2*sigma**2))

# y-data
f_true = gaus(x, 10., 25., 10.)
noise = np.random.normal(size=len(f_true)) * 0.2
f = f_true + noise

# y_error
f_err = f*0.05

# Defining the model
def model(x, f):
    A = pymc.Uniform('A', 0., 50., value = 12)
    x0 = pymc.Uniform('x0', 0., 50., value = 20)
    sigma = pymc.Uniform('sigma', 0., 30., value=8)

    @pymc.deterministic(plot=False)
    def gaus(x=x, A=A, x0=x0, sigma=sigma): 
        return A*np.exp(-(x-x0)**2/(2*sigma**2))
    y = pymc.Normal('y', mu=gaus, tau=1.0/f_err**2, value=f, observed=True)
    return locals()

MDL = pymc.MCMC(model(x,f))
MDL.sample(iter=20000, burn=10000, thin=1)  # 20000 iterations, first 10000 discarded as burn-in


# Extract best-fit parameters (posterior means) and their uncertainties
stats = MDL.stats()  # compute the summary once and reuse it

A_bf, A_unc = stats['A']['mean'], stats['A']['standard deviation']
x0_bf, x0_unc = stats['x0']['mean'], stats['x0']['standard deviation']
sigma_bf, sigma_unc = stats['sigma']['mean'], stats['sigma']['standard deviation']

# Extract and plot results
y_fit = stats['gaus']['mean']

plt.clf()
plt.errorbar(x, f, yerr=f_err, color='r', marker='.', label='Observed')
plt.plot(x, y_fit, 'k', ls='-', label='Fit')
plt.legend()
plt.show()

So far so good; this gives the following plot:

[Figure: Best fit plot using MCMC]

Now I want to test the goodness of fit using the method described in section 7.3 of https://pymc-devs.github.io/pymc/modelchecking.html. For this I first need f_sim, so I added the following code after the lines above:

# GOF plot
f_sim = pymc.Normal('f_sim', mu=gaus(x, A_bf, x0_bf, sigma_bf), tau=1.0/f_err**2, size=len(f))
pymc.Matplot.gof_plot(f_sim, f, name='f')
plt.show()

This raises AttributeError: 'Normal' object has no attribute 'trace'. I am trying to get gof_plot working before moving on to the discrepancy plot. I don't think replacing Normal with another distribution would be a good idea, given the Gaussian nature of the function. I would really appreciate it if someone could tell me what I am doing wrong. Also, the Normal distribution in pymc doesn't have a Normal_expval function for getting the expected values. Is there another way to calculate f_exp? Thanks.

Silentrash
  • Some of your questions may be answered at http://stackoverflow.com/questions/30731681/goodness-of-fit-in-pymc-and-plotting-discrepancies. –  Aug 31 '15 at 21:36
  • I looked at that before, but I wasn't entirely sure how to apply that description to my problem. I think I finally figured it out though. I will post my solution now. Thanks anyway. – Silentrash Sep 01 '15 at 19:47

1 Answer

I realized that f_sim has to come from the model that was actually fitted, since simulated values are the backbone of the Monte Carlo method; a standalone Normal that was never sampled has no trace. So I extracted the trace of gaus for the 10000 iterations kept after burn-in and used gof_plot as follows:

f_sim = MDL.trace('gaus', chain = None)[:]
pymc.Matplot.gof_plot(f_sim, f, name='f')
plt.show()

Works great now! Still not sure how to get f_exp though.
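
On f_exp: for a Normal likelihood the expected value of each data point is simply its mean mu, which in this model is the deterministic gaus. So the gaus trace is really f_exp, and simulated data f_sim would be Normal draws around it using the same f_err. In PyMC 2, one option (untested against this exact model) is to add a second, unobserved pymc.Normal with mu=gaus inside model() so that it gets a trace, then pass the observed, simulated, and expected values to pymc.discrepancy and plot the result with pymc.Matplot.discrepancy_plot, as in the linked docs. The NumPy-only sketch below shows roughly what that computation amounts to. The posterior draws are made-up stand-ins for MDL.trace(...), the helper name freeman_tukey is hypothetical, and clipping negative values before the square root is an assumption, not necessarily what pymc.discrepancy does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the data in the question (regenerated here, not the original draws).
x = np.linspace(1.0, 50.0, 50)
f_true = 10.0 * np.exp(-(x - 25.0) ** 2 / (2 * 10.0 ** 2))
f = f_true + rng.normal(scale=0.2, size=x.size)
f_err = np.abs(f) * 0.05  # abs() keeps the error bars non-negative

# ASSUMPTION: fake posterior draws standing in for MDL.trace('A')[:], etc.
n_samples = 2000
A = rng.normal(10.0, 0.05, n_samples)
x0 = rng.normal(25.0, 0.05, n_samples)
sigma = rng.normal(10.0, 0.05, n_samples)

# Expected values: one model curve per posterior draw, shape (n_samples, len(x)).
# With the real fit this would be MDL.trace('gaus')[:].
f_exp = A[:, None] * np.exp(-(x - x0[:, None]) ** 2 / (2 * sigma[:, None] ** 2))

# Simulated data: Normal noise around each expected curve, same errors as the fit.
f_sim = rng.normal(loc=f_exp, scale=f_err)


def freeman_tukey(data, expected):
    """Freeman-Tukey discrepancy, summed over data points for each draw."""
    # ASSUMPTION: negative values are clipped to zero so the square root is defined.
    root_data = np.sqrt(np.clip(data, 0.0, None))
    root_expected = np.sqrt(np.clip(expected, 0.0, None))
    return np.sum((root_data - root_expected) ** 2, axis=-1)


D_obs = freeman_tukey(f, f_exp)      # observed data vs. expected, per draw
D_sim = freeman_tukey(f_sim, f_exp)  # simulated data vs. expected, per draw

# Bayesian p-value: fraction of draws where the simulated discrepancy exceeds the observed one.
p_value = np.mean(D_sim > D_obs)
print("Bayesian p-value:", p_value)
```

Plotting D_sim against D_obs as a scatter with a 1:1 line gives the discrepancy plot; a p-value near 0 or 1 suggests a poor fit, while values near 0.5 suggest the simulated and observed data are similarly distant from the expected values.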

Silentrash