pymc determine sum of random variables

Question

I have two independent Normal distributed random variables a, b. In pymc, it's something like:

from pymc import Normal


def model():
    a = Normal('a', tau=0.01)
    b = Normal('b', tau=0.1)

I'd like to know what's a+b if we can see it as a normal distribution, that is:

from pymc import Normal


def model():
    a = Normal('a', tau=0.01)
    b = Normal('b', tau=0.1)

    tau_c = Uniform("tau_c", lower=0.0, upper=1.0)
    c = Normal("a+b", tau=tau_c, observed=True, value=a+b)

Then I'd like to estimate tau_c, but it doesn't work with pymc because a and b are stochastic (if they are arrays it's posible, but I don't have observations of a or b, I just know their distributions).

A way I think I can do it, is generating random values by using the distributions of each a and b and then doing this:

def model(a, b):
    tau_c = Uniform("tau_c", lower=0.0, upper=1.0)
    c = Normal("a+b", tau=tau_c, observed=True, value=a+b)

But I think there's a better way of doing this with pymc.

Thanks!

When posting a question on stackoverflow you should try to show the others that you have at least made some effort to solve your problem. For example you could post a few lines of code, indicating were you find a barrier. Is not clear from your question if you do not know how to use PyMC3 or if you have a problem with the skew normal distribution. Have you checked the [PyMC3 starting guide](http://pymc-devs.github.io/pymc3/notebooks/getting_started.html)? You can update your questions by providing details. — aloctavodia, Sep 26 '16 at 12:17
So you do not have any data? not even mean values with know error? — aloctavodia, Sep 27 '16 at 20:56

aloctavodia · Accepted Answer · 2016-09-27T21:11:57.070

If I understood correctly your question and code you should be doing something simpler. If you want to estimate the parameters of the distribution given by the sum of a and b, then use only the first block in the following example. If you also want to estimate the parameters for the variable a independently of the parameter of variable b, then use the other two blocks

with pm.Model() as model:
    mu = pm.Normal('mu', mu=0, sd=10)
    sd = pm.HalfNormal('sd', 10)
    alpha = pm.Normal('alpha', mu=0, sd=10)
    ab = pm.SkewNormal('ab', mu=mu, sd=sd, alpha=alpha, observed=a+b)

    mu_a = pm.Normal('mu_a', mu=0, sd=10)
    sd_a = pm.HalfNormal('sd_a', 10)
    alpha_a = pm.Normal('alpha_a', mu=0, sd=10)
    a = pm.SkewNormal('a', mu=mu_a, sd=sd_a, alpha=alpha_a, observed=a)

    mu_b = pm.Normal('mu_b', mu=0, sd=10)
    sd_b = pm.HalfNormal('sd_b', 10)
    alpha_b = pm.Normal('alpha_b', mu=0, sd=10)
    b = pm.SkewNormal('b', mu=mu_b, sd=sd_b, alpha=alpha_b, observed=b)

    trace = pm.sample(1000)

Be sure to use the last version of PyMC3, since previous versions did not include the SkewNormal distribution.

Update:

Given that you change your question:

If a and b are independent random variables and both are normally distributed then their sum is going to be normally distributed.

a ~ N(mu_a, sd_a²)

b ~ N(mu_b, sd_b²)

a+b ~ N(mu_a+mu_b, sd_a²+sd_b²)

that is you sum their means and you sum their variances (not their standard deviations). You don't need to use PyMC3.

If you still want to use PyMC3 (may be your distribution are not Gaussian and you do not know how to compute their sum analytically). You can generate synthetic data from your a and b distributions and then use PyMC3 to estimate the parameters, something in line of:

with pm.Model() as model:
    mu = pm.Normal('mu', mu=0, sd=10)
    sd = pm.HalfNormal('sd', 10)
    ab = pm.Normal('ab', mu=mu, sd=sd, observed=a+b)
    trace = pm.sample(1000)

Please update your question and do not rewrite it completely, if you do then the comments and answers does not make sense. — aloctavodia, Sep 27 '16 at 20:53
Ah ok, it was just an example, but actually, a and b are SkewNormal distributed and that's why I need to do it in pymc. Thanks for your answer. ¿Do you know if generating synthetic data is the best way of doing it in pymc? — sebastian álvarez, Sep 27 '16 at 21:26
I can't think of any better solution for this. When posting questions try to use examples as close as possible to your real problem. You may want to check the book "The Skew-normal and Related Families, by Azzalini. — aloctavodia, Sep 27 '16 at 21:59

pymc determine sum of random variables

1 Answers1