I am using PyMC3 to perform Bayesian linear regression. I built my model, and I want to generate posterior predictions for new X values using the same model. I have been following the instructions in the documentation: https://pymc-devs.github.io/pymc3/notebooks/posterior_predictive.html (see Prediction). This involves making the X values a theano shared variable before building the model, then changing its values after the model is built and running run_ppc(). I ran a quick 200 iterations just as an example (I'd run many more for an actual analysis).

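import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import theano
import pymc3 as pm
from pymc3 import HalfCauchy, Normal, NUTS, find_MAP, sample

# Wrap the predictor in a shared variable so its values can be swapped later
X1_shared = theano.shared(final_df['poll_diff'].values)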
Y1 = final_df['rd_diff'].values

basic_model = pm.Model()
with basic_model:

    # Priors for unknown model parameters
    sigma = HalfCauchy('sigma', beta=10, testval=1.)
    intercept = Normal('Intercept', 0, sd=20)
    x_coeff = Normal('x', 0, sd=20)

    # Define likelihood
    likelihood = Normal('y', mu=intercept + x_coeff * X1_shared,
                        sd=sigma, observed= Y1)

    start = find_MAP()  # Find starting value by optimization
    step = NUTS(scaling=start) # Instantiate MCMC sampling algorithm
    trace = sample(200, step, start=start)
pm.traceplot(trace)
plt.show()

[Figure: trace plot of the sampled parameters]

sns.lmplot(x="poll_diff", y="rd_diff", data=final_df, size=10)
x = np.array(range(-1, 2))
pm.glm.plot_posterior_predictive(trace, samples=100, eval=x)
plt.show()

[Figure: regression plot of rd_diff against poll_diff with posterior predictive lines]

X1_shared.set_value(ana_2016_df['poll_diff'].values)
ppc = pm.sample_ppc(trace, model=model, samples=100)

But I get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-73-9c1eb48d987f> in <module>()
----> 1 ppc = pm.sample_ppc(trace, model=model, samples=100)

C:\Users\W\Anaconda3\lib\site-packages\pymc3\sampling.py in sample_ppc(trace, samples, model, vars, size, random_seed)
    349 
    350     if vars is None:
--> 351         vars = model.observed_RVs
    352 
    353     seed(random_seed)

AttributeError: module 'pymc3.model' has no attribute 'observed_RVs'

Notably, if I use the patsy notation version, without changing the variables, this error does not pop up, but I don't know how the patsy format would accept a theano shared variable. So a solution would either address my error message, or show how to introduce a theano shared variable into the patsy version of the model. Thanks!

Flow Nuwen
  • I am not able to reproduce your error. I notice that your model is named `basic_model`, but then you call `ppc = pm.sample_ppc(trace, model=model, samples=100)`. Aren't you just mixing up variables (probably because you are working in a Jupyter notebook)? – aloctavodia Nov 24 '16 at 00:27
  • Wow, that solved my problem. Thanks for taking the time to look at it, sometimes it helps to have another set of eyes! – Flow Nuwen Nov 24 '16 at 00:42

1 Answer

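As aloctavodia pointed out, this was a simple mix-up of variable names. The model is named basic_model, but the call passed model=model. As the traceback shows, the name model referred to the pymc3.model module rather than a Model instance, so looking up observed_RVs on it failed. The call should be ppc = pm.sample_ppc(trace, model=basic_model, samples=100).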
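For anyone who hits the same message, here is a minimal sketch of the mechanism that runs without PyMC3 installed. Everything in it is a hypothetical stand-in: `FakeModel` plays the role of a `pm.Model` instance, `fake_sample_ppc` mimics only the one failing line from the traceback (`vars = model.observed_RVs`), and the module object named `model` imitates what a star import such as `from pymc3 import *` could plausibly have bound to that name in the notebook.

```python
import types

# Hypothetical stand-in for the real `pymc3.model` submodule. This is an
# assumption about how the bare name `model` ended up defined in the notebook.
model = types.ModuleType("pymc3.model")


class FakeModel:
    """Hypothetical stand-in for a pm.Model instance."""
    observed_RVs = ["y"]


basic_model = FakeModel()


def fake_sample_ppc(model):
    """Mimic only the failing line of sample_ppc: `vars = model.observed_RVs`."""
    return model.observed_RVs


def try_ppc(candidate):
    """Return the observed variables, or the error message if the lookup fails."""
    try:
        return fake_sample_ppc(candidate)
    except AttributeError as err:
        return str(err)


wrong = try_ppc(model)        # a module object, not a model
right = try_ppc(basic_model)  # the object that actually holds the variables
print(wrong)
print(right)
```

The first call reproduces the wording of the error in the question, because the attribute lookup happens on a module. A quick `type(model)` in the notebook before calling `pm.sample_ppc` would likely have revealed the mix-up.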

Flow Nuwen