I'm an absolute newbie to Bayesian stats and MCMC, so I'm working my way through "Doing Bayesian Data Analysis: A Tutorial with R and BUGS" by John Kruschke. To test my understanding, I'm trying to translate his examples from BUGS to PyMC.
In Ch. 8, he observes a sequence of flips from each of two (potentially biased) coins and tries to estimate the difference in their biases. Below, the bias for each coin is theta, and the observed flips are y:
import numpy as np
import pymc
priorA = 3
priorB = 3
theta1 = pymc.Beta('theta1', alpha=priorA, beta=priorB)
theta2 = pymc.Beta('theta2', alpha=priorA, beta=priorB)
y1 = pymc.Bernoulli('y1', p=theta1, value=np.array([1,1,1,1,1,0,0]), observed=True)
y2 = pymc.Bernoulli('y2', p=theta2, value=np.array([1,1,0,0,0,0,0]), observed=True)
The two coins are independent, of course. If I simulate them separately, then look at the difference in thetas, I get the same answer as the book. (It also agrees with the analytical solution and integration over a grid.)
m1 = pymc.MCMC([y1, theta1])
m1.sample(iter=10000, burn=1000, thin=2)
m2 = pymc.MCMC([y2, theta2])
m2.sample(iter=10000, burn=1000, thin=2)
dt = m1.trace('theta1')[:] - m2.trace('theta2')[:]
print np.average(dt), pymc.utils.hpd(dt, alpha=0.05)
# 0.23098692189 [-0.15303663 0.55686801]
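For reference, the grid check I mentioned is easy to reproduce with numpy alone (this is my own sketch, not code from the book). With a Beta(3, 3) prior, conjugacy gives theta1 ~ Beta(3+5, 3+2) and theta2 ~ Beta(3+2, 3+5), so the posterior means differ by 8/13 - 5/13 = 3/13 ≈ 0.2308, matching the independent-coins run above:

```python
import numpy as np

# Grid approximation of each coin's posterior under a Beta(3, 3) prior.
# y1 has 5 heads / 2 tails, y2 has 2 heads / 5 tails.
grid = np.linspace(0, 1, 2001)[1:-1]  # open interval, avoids 0**0 edge cases

def posterior(heads, tails, a=3, b=3):
    # Unnormalized Beta(a + heads, b + tails) density, normalized on the grid
    dens = grid ** (a + heads - 1) * (1 - grid) ** (b + tails - 1)
    return dens / dens.sum()

p1 = posterior(5, 2)
p2 = posterior(2, 5)
mean_diff = (grid * p1).sum() - (grid * p2).sum()
print(mean_diff)  # ≈ 0.2308, i.e. 3/13
```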
If, on the other hand, I try to simulate the two coins at the same time, as part of the same model, I get a fairly different answer, in poor agreement with any other method.
model = pymc.MCMC([y1, theta1, y2, theta2])
model.sample(iter=10000, burn=1000, thin=2)
dt = model.trace('theta1')[:] - model.trace('theta2')[:]
print np.average(dt), pymc.utils.hpd(dt, alpha=0.05)
# 0.330061191424 [-0.09999254 0.7190196 ]
So my first question is, why? From what little I understand of the theory, my wild guess would be that MCMC is having a hard time exploring the two-parameter model adequately. (However, BUGS seems to handle it just fine.)
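To test that guess, I also tried a hand-rolled random-walk Metropolis sampler for the joint two-theta posterior in plain numpy (my own sketch, independent of PyMC; the proposal scale of 0.2 and the chain lengths are arbitrary choices). It recovers the ~0.23 answer without difficulty, which makes me doubt the joint posterior is actually hard to explore:

```python
import numpy as np

y1 = np.array([1, 1, 1, 1, 1, 0, 0])
y2 = np.array([1, 1, 0, 0, 0, 0, 0])
a = b = 3  # Beta prior hyperparameters, as above

def log_post(t, heads, tails):
    # Unnormalized log posterior for one coin: Beta(a, b) prior is
    # conjugate to the Bernoulli likelihood, giving Beta(a+heads, b+tails).
    if not 0.0 < t < 1.0:
        return -np.inf
    return (a + heads - 1) * np.log(t) + (b + tails - 1) * np.log(1 - t)

rng = np.random.RandomState(42)
counts = [(int(y.sum()), int(len(y) - y.sum())) for y in (y1, y2)]
theta = np.array([0.5, 0.5])
samples = []
for i in range(40000):
    for j, (h, t) in enumerate(counts):
        prop = theta[j] + 0.2 * rng.randn()  # random-walk proposal
        if np.log(rng.rand()) < log_post(prop, h, t) - log_post(theta[j], h, t):
            theta[j] = prop
    samples.append(theta.copy())
samples = np.array(samples[5000:])  # drop burn-in
dt = samples[:, 0] - samples[:, 1]
print(np.mean(dt))  # ≈ 0.23, agreeing with the independent-coins run
```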
The really weird thing is that I've been doing all this in an IPython notebook, and there appears to be a bug in PyMC. If I run the independent-coins model, restart my kernel (Kernel | Restart or File | Close and Halt), and then run the joint-coins model, the joint model produces an answer similar (but not identical) to the independent coins (average dtheta ~ 0.23). If I run the models in the opposite order (restarting the kernel in between), both produce the (incorrect) answer from the joint-coins model, average dtheta ~ 0.33. The only way I can get the two models to produce different answers is to shut down my IPython notebook server completely in between. Since that would also unload all shared libraries from memory, I suspect the Fortran/C part of PyMC is caching something from these models in a memory location that's shared across Python interpreter instances.

Versions: NumPy 1.8.2, PyMC 2.3.3, Python 2.7.8, IPython 2.1.0, Anaconda 2.0.0.
I'd really appreciate any help in understanding what's going on here. I realize these are silly, trivial models, but the odd behavior of PyMC is not inspiring confidence at the moment!