2

I'm trying to compare two models (example from Jake Vanderplas' blog) using PyMC3, but I can't get my modified code to work (the functions best_theta() and logL() are explained in Jake's blog post, which is available in IPython Notebook form):

degrees = [1, 2, 3]

# best_theta() finds the best set of parameters for a given model
thetas = [best_theta(d) for d in degrees]

n = len(degrees)
prob = np.array([ 1 for _ in degrees ])

# model specs
from pymc3 import Model, Dirichlet, Categorical, DensityDist

with Model() as bfactor:
    choices = Dirichlet('choices', prob, shape=prob.shape[0])

    choice = Categorical('choice', choices)

    indmodel = [0] * len(degrees)
    for i, d in enumerate(degrees):
        # logL() calculates the log-likelihood for a given model
        indmodel[i] = DensityDist('indmodel', lambda value: logL(thetas[i]))

    fullmodel = DensityDist('fullmodel', lambda value: indmodel[choice].logp(value))

That raises an exception, because the variable choice is an RV object, not an integer (unlike in PyMC2), as discussed in this question. In my code, however, the value of choice is important to make it work.

My question is, is there a way to access the value of the RV choice, or more generally set up a hierarchical model using Categorical random variables (i.e. use the value of a Categorical RV to calculate the log likelihood of another RV)?

LmnICE
  • 163
  • 1
  • 11

1 Answers1

3

I took a quick stab at this. However, the approach needed to be changed quite a bit as it's often more convenient to vectorize the model. This also revealed a bug which I fixed (https://github.com/pymc-devs/pymc3/commit/c784c478aa035b5113e175145df8751b0dea0df3) so you will need to update from current master for this to work.

Here is the full NB: https://gist.github.com/anonymous/c1ada3388a40ae767a8d

It doesn't seem to quite work yet as the results are not identical but it's a step into the right direction.

twiecki
  • 1,316
  • 8
  • 11
  • Hey, thanks! I like your solution, the model specification is clearer than mine. As you can probably tell, I'm just starting out with PyMC3. So, as best as I can tell, you can reference RV objects as you would their current values in the current MCMC step, but only within the context of another RV. In other words, your `coeffs[choice, :(choice+1)]` snippet is legal because `coeffs` is itself a (collection of) RV, whereas my `indmodel[choice]` snippet is illegal, because `indmodel` is a regular ol' list. Is that correct? – LmnICE Aug 27 '15 at 20:11