
Not sure if I am doing something silly or pymc3 has a bug, but when I try to fit a Student T distribution to normally distributed data I get a very small number of degrees of freedom (0.18 to 0.25), whereas I'd expect something high, at least 4-5. I get the same problem if I instead generate the data from a T distribution with a reasonable number of degrees of freedom, like 3 or 5.

import numpy as np
import pymc3 as pm

Nsample = 200000
tst = np.random.normal(loc=1e4, scale=5e4, size=250)

with pm.Model() as m:
    mean = pm.Normal('mean', mu=0, sd=1e5)
    sigma = pm.Flat('sigma')  # I also tried Uniform, Gamma, Exponential
    df = pm.Flat('df')        # the same
    v = pm.T('pl', nu=df, mu=mean, lam=1.0/sigma, observed=tst)

    start = {'df': 5, 'mean': 1e4, 'sigma': 5e4}  # start = pm.find_MAP()

    step = pm.Metropolis()
    trace = pm.sample(Nsample, step, start=start, progressbar=True)

pm.traceplot(trace[100000:], vars=['df', 'sigma', 'mean']);

Could you suggest a fix (changing priors, sampling method, etc.)?

1 Answer


Why would you expect to see a df of around 4-5? A T distribution with df -> inf equals a Normal distribution, so fitting a T to normal data should push df as high as the prior allows. When I run your model and do print trace['df'][10000:].mean() I get 1.19876070951e+13, i.e. something extremely large.
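For intuition, here is a minimal sketch (not from the original answer) comparing the Student T density at a very large df with the standard normal density using scipy; the numbers it prints are illustrative, not from the thread:

import numpy as np
from scipy import stats

x = np.linspace(-4, 4, 9)
# Maximum absolute difference between the T density with huge df and the normal density;
# it prints a very small number, i.e. the two densities nearly coincide.
print(np.abs(stats.t.pdf(x, df=1e6) - stats.norm.pdf(x)).max())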

One reason you might see something different is that the Metropolis sampler tends to fail when it proposes jumps in the joint space of all variables at once (which used to be the default in pymc3). If you haven't updated pymc3 from master recently, try updating and running the model again: Metropolis now defaults to non-blocked sampling, i.e. it samples each variable separately (see the sketch below for forcing this explicitly on an older install).
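If you are stuck on an older pymc3, one way to get variable-at-a-time updates is to pass one Metropolis step method per free variable via the vars argument. A minimal sketch, assuming the model m, the variables mean/sigma/df, and the Nsample and start values from the question are in scope:

with m:
    # one step method per free variable instead of a single blocked proposal
    step = [pm.Metropolis(vars=[mean]),
            pm.Metropolis(vars=[sigma]),
            pm.Metropolis(vars=[df])]
    trace = pm.sample(Nsample, step, start=start, progressbar=True)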

twiecki
  • It works great now, even if I replace tst with tst = np.random.standard_t(3.0, size=250)*20000 + 1e4 as I originally planned. Thanks! Though I still do not understand why the previous version did not work. Back to studying. – Alexey Goldin Oct 24 '14 at 03:35
  • NUTS still gives ridiculous answers (df ~ 0.22). – Alexey Goldin Oct 24 '14 at 03:45
  • Are you using `start = find_MAP(); sampler = NUTS(scaling=start)`? Those are almost always required. – twiecki Oct 25 '14 at 07:13
  • I apologize, NUTS works. My problem was that my prior was too constraining for the scale parameter of the Student T distribution: I had not realized that lam is a precision, roughly 1/(width squared). The overly restrictive prior on the scale pushed the degrees of freedom to very low values, since that was the only way to get such values in the tails. NUTS works now; I had to replace v = pm.T("pl", nu=df, mu=mean, lam=1.0/sigma, observed=tst) with v = pm.T("pl", nu=df, mu=mean, lam=1.0/(sigma*sigma), observed=tst), and pm.Flat with pm.Uniform or pm.Exponential (a sketch combining these fixes follows below). Thanks! – Alexey Goldin Oct 26 '14 at 00:06
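A minimal sketch pulling the fixes from these comments together: lam as a precision 1/sigma**2, proper priors instead of pm.Flat, and NUTS initialized from find_MAP as suggested in the answer comments. The specific Uniform bounds and the Exponential rate on df are illustrative assumptions, not values from the original thread:

import numpy as np
import pymc3 as pm

tst = np.random.standard_t(3.0, size=250) * 20000 + 1e4

with pm.Model() as m:
    mean = pm.Normal('mean', mu=0, sd=1e5)
    sigma = pm.Uniform('sigma', lower=1.0, upper=1e6)  # scale (width), kept positive; bounds are assumptions
    df = pm.Exponential('df', lam=1.0/10.0)            # weakly informative prior on nu; rate is an assumption
    # lam is a precision: 1/sigma**2, not 1/sigma
    v = pm.T('pl', nu=df, mu=mean, lam=1.0/(sigma*sigma), observed=tst)

    start = pm.find_MAP()
    step = pm.NUTS(scaling=start)
    trace = pm.sample(2000, step, start=start, progressbar=True)

pm.traceplot(trace[1000:], vars=['df', 'sigma', 'mean']);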