I am trying to model an ordinal predicted variable using PyMC3 based on the approach in chapter 23 of Doing Bayesian Data Analysis. I would like to determine a good starting value using find_MAP, but am receiving an optimization error.
The model:
import pymc3 as pm
import numpy as np
import theano
import theano.tensor as tt
# Some helper functions
def cdf(x, location=0, scale=1):
epsilon = np.array(1e-32, dtype=theano.config.floatX)
location = tt.cast(location, theano.config.floatX)
scale = tt.cast(scale, theano.config.floatX)
div = tt.sqrt(2 * scale ** 2 + epsilon)
div = tt.cast(div, theano.config.floatX)
erf_arg = (x - location) / div
return .5 * (1 + tt.erf(erf_arg + epsilon))
def percent_to_thresh(idx, vect):
return 5 * tt.sum(vect[:idx + 1]) + 1.5
def full_thresh(thresh):
idxs = tt.arange(thresh.shape[0] - 1)
thresh_mod, updates = theano.scan(fn=percent_to_thresh,
sequences=[idxs],
non_sequences=[thresh])
return tt.concatenate([[-1 * np.inf, 1.5], thresh_mod, [6.5, np.inf]])
def compute_ps(thresh, location, scale):
f_thresh = full_thresh(thresh)
return cdf(f_thresh[1:], location, scale) - cdf(f_thresh[:-1], location, scale)
# Generate data
real_ps = [0.05, 0.05, 0.1, 0.1, 0.2, 0.3, 0.2]
data = np.random.choice(7, size=1000, p=real_ps)
# Run model
with pm.Model() as model:
mu = pm.Normal('mu', mu=4, sd=3)
sigma = pm.Uniform('sigma', lower=0.1, upper=70)
thresh = pm.Dirichlet('thresh', a=np.ones(5))
cat_p = compute_ps(thresh, mu, sigma)
results = pm.Categorical('results', p=cat_p, observed=data)
with model:
start = pm.find_MAP()
trace = pm.sample(2000, start=start)
When running this, I receive the following error:
Applied interval-transform to sigma and added transformed sigma_interval_ to model.
Applied stickbreaking-transform to thresh and added transformed thresh_stickbreaking_ to model.
Traceback (most recent call last):
File "cm_net_log.v1-for_so.py", line 53, in <module>
start = pm.find_MAP()
File "/usr/local/lib/python3.5/site-packages/pymc3/tuning/starting.py", line 133, in find_MAP
specific_errors)
ValueError: Optimization error: max, logp or dlogp at max have non-finite values. Some values may be outside of distribution support. max: {'thresh_stickbreaking_': array([-1.04298465, -0.48661088, -0.84326554, -0.44833646]), 'sigma_interval_': array(-2.220446049250313e-16), 'mu': array(7.68422528308479)} logp: array(-3506.530143064723) dlogp: array([ 1.61013190e-06, nan, -6.73994118e-06,
-6.93873894e-06, 6.03358122e-06, 3.18954680e-06])Check that 1) you don't have hierarchical parameters, these will lead to points with infinite density. 2) your distribution logp's are properly specified. Specific issues:
My questions:
- How can I determine why dlogp is nan at certain points?
- Is there a different way that I can express this model to avoid dlogp being nan?
Also worth noting:
- This model runs fine if I don't find_MAP and use a Metropolis sampler. However, I'd like to have the flexibility of using other samplers as this model becomes more complex.
- I have a suspicion that the issue is due to the relationship between the thresholds and the normal distribution, but I don't know how to disentangle them for the optimization.