How to do panel data analysis in a Bayesian model with PyMC

Question

Hi everyone. I have a question about how to do panel data analysis with a Bayesian model in PyMC. The data looks like this:

..........................................................
User    Time     x1          x2         x3           Y
1        1        1          1           3           2      
1        2        2          1           4           1
1        3        2          2           2           1
1        4        1          3           1           3
1        5        1          1           2           3
2        1        1          3           1           3  
2        2        1          1           2           2
2        3        2          3           1           0
2        4        1          2           2           3
2        5        1          1           1           2    
3        1        4          3           1           3  
3        2        3          1           3           2
3        3        2          3           2           2
3        4        2          1           2           3
3        5        1          1           1           2
4        1        1          1           3           2      
4        2        2          2           4           3
4        3        2          2           2           1
4        1        1          3           1           3
4        1        4          5           2           3  
.............   
..........................................................

I have samples for N users at T time points each (N≫T), with independent variables (x1, x2, x3) and a dependent variable (Y).

I want to analyze the impact of the X's on Y at the collective level. Taking the simplest case, linear regression, as an example, and following the book "Introduction to Bayesian Econometrics" (p. 145), the general model is often written as:

$$ y_{it} = x_{it}{\beta}+ w_{it}{b_i}+ {u_{it}}, i = 1,...,n;\;\;t = 1,...,T $$

Here $i$ indexes the user and $t$ the time; ${\beta}$ does not differ across $i$ and is called the fixed effect; ${b_i}$ differs across $i$ and is called the random effect.

From the Bayesian point of view, both ${\beta}$ and ${b_i}$ are regarded as random variables. So let ${\beta} \sim N({\beta}_0,{\beta}_1)$ and ${b_i} \sim N({\lambda_0},{\lambda_1})$.
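To make the structure of the model concrete, here is a minimal NumPy sketch that simulates data from it. It assumes the simplest case $w_{it} = 1$ (a random intercept per user), and the sizes, coefficient values and noise scales are made up purely for illustration. The key point is the long-format layout: one row per (user, time) pair, with an integer index that maps each row to its user's $b_i$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: many users, few time points (N >> T), three predictors
n_users, n_times, n_x = 200, 5, 3

# Long format: row k belongs to user user_idx[k]
user_idx = np.repeat(np.arange(n_users), n_times)

# Hypothetical predictors and "true" parameters, for illustration only
X = rng.normal(size=(n_users * n_times, n_x))
beta = np.array([0.5, -1.0, 2.0])                   # shared across users (fixed effects)
b = rng.normal(0.0, 1.0, size=n_users)              # one draw per user (random effects)
u = rng.normal(0.0, 0.3, size=n_users * n_times)    # idiosyncratic noise u_it

# y_it = x_it beta + b_i + u_it, with w_it = 1 assumed
y = X @ beta + b[user_idx] + u
```

The expression `b[user_idx]` expands the N user-level effects to one value per row, which is the same indexing trick a PyMC model would use.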

That is the general idea in theory, but I have no idea how to specify and fit this model in PyMC.

Thanks to anyone who can give me some inspiration or example code.

score 0 · Answer 1 · answered Jan 25 '15 at 16:43

The following blog post contains a good example of fitting a linear regression using PyMC3. It also contains a short cut, using the glm module, which is particularly useful for those familiar with R syntax.

http://twiecki.github.io/blog/2013/09/12/bayesian-glms-1/

For your model, which is multivariate, you will want an x_coeff for each variable. The easiest way to do this is to pass 'size = 4' when calling Normal() (in PyMC3 the argument is named 'shape'). This generates 4 stochastic variables, one for each variable in your data, and returns them as an array.

you should copy the relevant parts of the answer in here in case the link ever dies — Scriptable, Jan 25 '15 at 16:48
