pymc fitting with several observations a time

Question

I am trying to use pymc to fit the time evolution of oscillating data. Here I have not just one point per time step, but several of them.

I simply cannot find an efficient way to make this work in pymc3 as it always raises some input value errors. So I wondered if there is a good solution that would be known. I attached the code, but it can also be found as ipython notebook here here.

# coding: utf-8

# # Two level oscillation tests

# In[2]:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import numpy.linalg as npl
from IPython.core.pylabtools import figsize
get_ipython().magic('matplotlib inline')


# create an artificial data set with several points per time value

# In[3]:

import pymc3 as pm
sigma =0.2;
Omega =0.5;
Nt = 20;
tmax =2;
Nrep = 5;
tlin = np.linspace(0,tmax,Nt);
t_1 = tlin[:];
t_2 = tlin[:];
n1_simu = np.sin(2*np.pi*Omega*tlin)**2;
n2_simu = 1 - n1_simu;

n1_noise = 0.2*np.random.randn(Nt);
n2_noise = 0.2*np.random.randn(Nt);

n1_exp = n1_simu+n1_noise;
n2_exp = n2_simu+n2_noise;

for jj in np.arange(Nrep):
    n1_noise = 0.2*np.random.randn(Nt); 
    n2_noise = 0.2*np.random.randn(Nt);
    n2_exp = np.r_[n2_exp, n2_simu+n2_noise]
    n1_exp = np.r_[n1_exp, n1_simu+n1_noise]
    t_1 = np.r_[t_1, tlin]
    t_2 = np.r_[t_2, tlin]

nt_exp = np.r_[n1_exp, n2_exp];
t_all = np.r_[t_1, t_2];
plt.figure(1)
plt.clf;
plt.plot(t_1,n1_exp, 'o');
plt.plot(t_1,n2_exp, 'o');
plt.xlabel('t')
plt.ylabel('population')


# Now that we have the simulated datas let us simulate them with pymc. 
# 
# The key is to put the mean value function into the Deterministic     symbol, then pymc unstands that it is supposed to be a variable.

# In[4]:

basic_model = pm.Model()


with basic_model:
    # Priors for unknown model parameters
    sigma = pm.HalfNormal('sigma', sd=1)
    Omega = pm.Normal('omega', mu=0.55, sd=0.1)
    amp = pm.Normal('Amplitude', mu=0.55, sd=0.1)


    # Expected value of outcome
    n1 = amp*pm.sin(2*np.pi*Omega*t_1)**2
    n2 = 1-n1
    Nval = len(nt_exp)
    Nswitch = len(n1_exp)
    idx = np.arange(Nval)
    if n1.shape:
        print(n1.shape)
        rate = pm.switch(Nswitch>= idx, np.r_[n1, n1], np.r_[n2, n2])

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=rate, sd=sigma, observed=nt_exp)


    # now sample it

    # In[ ]:

    Nsamples  =5000
    with basic_model:
    # obtain starting values via MAP
    start = pm.find_MAP()

    # instantiate sampler
    step = pm.NUTS(scaling=start)

    # draw 500 posterior samples
    trace = pm.sample(Nsamples, step, start=start)

This is hardly enough information to provide an answer. Can you post some code that replicates your problem? — twiecki, Aug 27 '15 at 11:36

score 0 · Answer 1 · answered Aug 28 '15 at 16:31

0

The key is to just put in a y-value for every x-value you have. If you multiple points for each x-value, put in multiple x,y-pairs.

There were some other issues with the model which I fixed here: https://gist.github.com/3ab2cba82e3cb5291e38

Hope I understood what you tried to correctly. Essentially this is a mixture model inspired by https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/gaussian_mixture_model.ipynb.

answered Aug 28 '15 at 16:31

twiecki

1,316
8
11

Thanks for the answer, however I think that your solution is not exactly what I was looking for. The two signals are actually supposed to be physically different. Signal 1, comes from detector 1, signal 2 comes from detector 2. So I know the category, and do not have to guess it anymore. But now I want to fit the signal from both detectors. This seems quite different to the mixture example to me. I only concenvanted n1 and n2, to be able to pass them to pymc, but maybe this was the wrong way to go. – fretchen Aug 31 '15 at 07:57
You can just remove the mixture in that case, replace it with 0s and 1s to get the indexing correct and the code I posted should work seamlessly. – twiecki Sep 01 '15 at 11:43
Hi, thanks so much for your help, but I am still lost. I tried to modify the code, but I am not sure what you suggested exactly. So, I tried to remove the category and replace it by indexing as shown [here](https://gist.github.com/fretchen/12708bb4af1455787764). However, I always get an error that the index is not integer, which does not make much sense me. I still can't wrap my head about efficient debugging for theano code, as it seems hard to figure out shapes etc. – fretchen Sep 11 '15 at 12:17
I was able to get things to run by adding `dtype='i16'` to the ones and zeros index: https://gist.github.com/a9e30da49f513c31f5c1 – twiecki Sep 12 '15 at 13:45

pymc fitting with several observations a time

1 Answers1