Can someone point me to the docs that will explain what I'm seeing?
Pink stuff in a Jupyter notebook makes me think something is wrong.
Using PyMC3 (btw, it's an exercise for a class and I have no idea what I'm doing).
I plugged in the numbers, initially got an error about 0s on the diagonal, swapped alpha_est
and rate_est
to be 1/alpha_est
and 1/rate_est
(and stopped getting the error), but I still get the pink stuff.
This code came with the exercise:
# An initial guess for the gamma distribution's alpha and beta
# parameters can be made as described here:
# https://wiki.analytica.com/index.php?title=Gamma_distribution
alpha_est = np.mean(no_insurance)**2 / np.var(no_insurance)
beta_est = np.var(no_insurance) / np.mean(no_insurance)
# PyMC3 Gamma seems to use rate = 1/beta
rate_est = 1/beta_est
# Initial parameter estimates we'll use below
alpha_est, rate_est
And then the code I'm supposed to add:
Should the pink stuff make me nervous or do I just say "No errors, move on"?
=======
The "zero problem"
---------------------------------------------------------------------------
RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 110, in run
self._start_loop()
File "/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 160, in _start_loop
point, stats = self._compute_point()
File "/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 191, in _compute_point
point, stats = self._step_method.step(self._point)
File "/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/step_methods/arraystep.py", line 247, in step
apoint, stats = self.astep(array)
File "/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/step_methods/hmc/base_hmc.py", line 130, in astep
self.potential.raise_ok(self._logp_dlogp_func._ordering.vmap)
File "/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/step_methods/hmc/quadpotential.py", line 231, in raise_ok
raise ValueError('\n'.join(errmsg))
ValueError: Mass matrix contains zeros on the diagonal.
The derivative of RV `alpha__log__`.ravel()[0] is zero.
"""
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
ValueError: Mass matrix contains zeros on the diagonal.
The derivative of RV `alpha__log__`.ravel()[0] is zero.
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
<ipython-input-14-36f8e5cebbe5> in <module>
13 g = pm.Gamma('g', alpha=alpha_, beta=rate_, observed=no_insurance)
14
---> 15 trace = pm.sample(10000)
/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, **kwargs)
435 _print_step_hierarchy(step)
436 try:
--> 437 trace = _mp_sample(**sample_args)
438 except pickle.PickleError:
439 _log.warning("Could not pickle model, sampling singlethreaded.")
/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, **kwargs)
967 try:
968 with sampler:
--> 969 for draw in sampler:
970 trace = traces[draw.chain - chain]
971 if (trace.supports_sampler_stats
/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/parallel_sampling.py in __iter__(self)
391
392 while self._active:
--> 393 draw = ProcessAdapter.recv_draw(self._active)
394 proc, is_last, draw, tuning, stats, warns = draw
395 if self._progress is not None:
/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pymc3/parallel_sampling.py in recv_draw(processes, timeout)
295 else:
296 error = RuntimeError("Chain %s failed." % proc.chain)
--> 297 raise error from old_error
298 elif msg[0] == "writing_done":
299 proc._readable = True
RuntimeError: Chain 0 failed.
is the "hint" in the instructions here telling me I should use 1/rate_est
?
You are now going to create your own PyMC3 model!
Use an exponential prior for alpha. Call this stochastic variable alpha_.
Similarly, use an exponential prior for the rate ( 1/ ) parameter in PyMC3's Gamma.
Call this stochastic variable rate_ (but it will be supplied as pm.Gamma's beta parameter). Hint: to set up a prior with an exponential distribution for where you have an initial estimate for of 0 , use a scale parameter of 1/0 .
Create your Gamma distribution with your alpha_ and rate_ stochastic variables and the observed data.
Perform 10000 draws.
The zero problem could be because you are sampling zeros from exponential distribution.
Ah:
rate_est is 0.00021265346963636103
rate_ci = np.percentile(trace['rate_'], [2.5, 97.5]) rate_ci = [0.00022031, 0.00028109]
1/rate_est is 4702.486170152818
I can believe I am sampling zeros if I use rate_est
.