3

I find myself wanting to perform arithmetic of random variables in Python; for the sake of example, let us consider the experiment of repeatedly tossing two independent fair coins and counting the number of heads.

Sampling from each random variable independently is straightforward with scipy.stats, and we can start getting results right away

In [5]: scipy.stats.bernoulli(0.5).rvs(10) + scipy.stats.bernoulli(0.5).rvs(10)
Out[5]: array([1, 0, 0, 0, 1, 1, 1, 2, 1, 2])

Now, a pessimist would remark that we wouldn't even have to go that far and could instead just do np.random.randint(2, size=10) + np.random.randint(2, size=10), and a cynic would notice that we could just calculate the sum and never have to sample anything.

And they'd be right. So, say that we have many more variables and more complex operations to perform on them, and graphical models quickly become useful. That is, we might want to operate on the random variables themselves and only start sampling when our graph of computation is set up. In lea, which does exactly that (albeit only for discrete distributions), the example above becomes

In [1]: from lea import Lea

In [7]: (Lea.bernoulli(0.5) + Lea.bernoulli(0.5)).random(10)
Out[7]: (0, 2, 0, 2, 0, 2, 1, 1, 1, 2)

Appears to be working like a charm. Enter PyMC3, one of the more popular libraries for probabilistic programming. Now, PyMC3 is intended for usage with MCMC and Bayesian modeling in particular, but it has the building blocks we need for our experiment above. Alas,

In [1]: import pymc3 as pm

In [2]: pm.__version__
Out[2]: '3.2'

In [3]: with pm.Model() as model:
   ...:     x = pm.Bernoulli('x', 0.5)
   ...:     y = pm.Bernoulli('y', 0.5)
   ...:     z = pm.Deterministic('z', x+y)
   ...:     trace = pm.sample(10)
   ...:
Assigned BinaryGibbsMetropolis to x
Assigned BinaryGibbsMetropolis to y
100%|███████████████████████████████████████| 510/510 [00:02<00:00, 254.22it/s]

In [4]: trace['z']
Out[4]: array([2, 0, 2, 0, 2, 0, 2, 0, 2, 0], dtype=int64)

Not exactly random. Unfortunately, I lack the theoretical understanding of why the Gibbs sampler produces this particular result (and really I should probably just hit the books). Using step=pm.Metropolis() instead, we get the correct distribution at the end of the day, even if the individual samples correlate strongly with their neighbours (as is to be expected from MCMC).

In [8]: with pm.Model() as model:
   ...:     x = pm.Bernoulli('x', 0.5)
   ...:     y = pm.Bernoulli('y', 0.5)
   ...:     z = pm.Deterministic('z', x+y)
   ...:     trace = pm.sample(10000, step=pm.Metropolis())
   ...:
100%|██████████████████████████████████████████████████████████████████████████████████████████| 10500/10500 [00:02<00:00, 5161.18it/s]

In [14]: collections.Counter(trace['z'])
Out[14]: Counter({0: 2493, 1: 5024, 2: 2483})

So, maybe I could just go ahead and use pm.Metropolis for simulating my post-arithmetic distribution, but I'd be afraid that I was missing something, and so the question finally becomes: Why does the step-less simulation above fail, and are there any pitfalls in using PyMC3 for ordinary, non-MC, MC, and is what I'm trying to do even possible in PyMC3 in the first place?

fuglede
  • 17,388
  • 2
  • 54
  • 99
  • I'm not finished debugging this, but preliminarily it looks like `BinaryGibbsMetropolis` subclasses `ArrayStep`, which seems to implement a behavior that multiple random variables are updated together. I *think* this means that there's a bug and somewhere this is assigning your variables `x` and `y` to *the same instance* of `BinaryGibbsMetropolis`, which is why they match each other at every random step (the Gibbs sampler updated the whole vector of variables, rather than updating components separately or treating the variables with two separate samplers). – ely Feb 16 '18 at 22:56
  • 1
    Even if I use something like this, I still see the same weird, fixed behavior: `pm.sample(10000, step=[pm.BinaryGibbsMetropolis([x]), pm.BinaryGibbsMetropolis([y])], random_seed=[109, 287])`. But if I even change just one of them to plain `BinaryMetropolis` or `Metropolis`, then it corrects the problem. How dare you do this to me on a Friday evening! – ely Feb 16 '18 at 23:07
  • @ely: Thanks for looking into this. So in particular you seem to be saying that the first attempt ought to have actually worked? – fuglede Feb 17 '18 at 09:35
  • Yes, I think it should work the same, and I cannot see any code in the `astep` member function of both `BinaryGibbsMetropolis` and `BinaryMetropolis` that would differ, so the fact that `BinaryMetropolis` has an acceptance rate of less than 100%, while `BinaryGibbsMetropolis` seems to have a 100% acceptance rate, seems impossible from the code, so something else must be going on. – ely Feb 17 '18 at 16:44
  • Shouldn't it be`pm.__version__` ? – Severin Pappadeux Feb 18 '18 at 19:19
  • @Severin Pappadeux: It should. – fuglede Feb 18 '18 at 20:21
  • Definitely a bug - https://github.com/pymc-devs/pymc3/issues/2866 . What you are doing *should* work, but is not the intention of the library. You would use PyMC3 to reason about uncertainty (perhaps observing `z` and reasoning about the probabilities of `x` and `y`). I think your first two approaches, and perhaps the `pomegranate` library would be more efficient. See https://stackoverflow.com/questions/46454814/how-to-simulate-a-biased-6-sided-dice-using-pymc3/46459429#46459429 – colcarroll Feb 21 '18 at 13:48
  • This is now fixed on master (see https://github.com/pymc-devs/pymc3/pull/2867) by Junpeng Lao. See http://andrewgelman.com/2018/01/18/measuring-speed-stan-incorrectly-faster-thought-cases-due-antithetical-sampling/ for background on "Anticorrelated draws". I am not sure how stackoverflow wants to handle a question like this. – colcarroll Feb 25 '18 at 20:10

1 Answers1

0

Comments by colcarroll:

[Feb. 21, 2018]: Definitely a bug - github.com/pymc-devs/pymc3/issues/2866 . What you are doing should work, but is not the intention of the library. You would use PyMC3 to reason about uncertainty (perhaps observing z and reasoning about the probabilities of x and y). I think your first two approaches, and perhaps the pomegranate library would be more efficient. See stackoverflow.com/questions/46454814/… –

[Feb. 25, 2018]: This is now fixed on master (see github.com/pymc-devs/pymc3/pull/2867) by Junpeng Lao. See andrewgelman.com/2018/01/18/… for background on "Anticorrelated draws". I am not sure how stackoverflow wants to handle a question like this.

Peter O.
  • 32,158
  • 14
  • 82
  • 96