
I am estimating a fixed effects model on panel data using the Python statsmodels package.

The data consist of X and Y variables observed over time for several companies. A small sample is shown below; the actual data set is a balanced panel covering one year for about 5,000 companies.

| date       | firm | X1 | X2 | X3 | Y |
|:---------- |:----:|:--:|:--:|:--:|--:|
| 2021-01-01 | A    | 1  | 4  | 1  | 10|
| 2021-01-02 | A    | 2  | 7  | 0  | 21|
| 2021-01-03 | A    | 4  | 3  | 1  | 12|
| 2021-01-01 | B    | 2  | 1  | 0  | 4 |
| 2021-01-02 | B    | 3  | 7  | 1  | 9 |
| 2021-01-03 | B    | 7  | 1  | 1  | 4 |

When I fit a fixed effects model that controls for the company effect with the code below, the results are produced without any problems.

from linearmodels import PanelOLS

mod = PanelOLS.from_formula('Y ~ X1 + X2 + X3 + EntityEffects',
                            data=df.set_index(['firm', 'date']))
result = mod.fit(cov_type='clustered', cluster_entity=True)
result.summary

[Output: PanelOLS estimation summary]

However, the intercept term does not appear in the results, and I would like to find a way to fix this.

Is there an option to force the intercept term to be reported?

Kevin S

1 Answer


It is not very clear from the GitHub repository, but it looks like the estimated effects are stored under `result.estimated_effects`. You should also mention that `PanelOLS` is from linearmodels, not statsmodels.

from linearmodels import PanelOLS
import pandas as pd

df = pd.DataFrame({
    'date': ['2021-01-01', '2021-01-02', '2021-01-03',
             '2021-01-01', '2021-01-02', '2021-01-03'],
    'firm': ['A', 'A', 'A', 'B', 'B', 'B'],
    'X1': [1, 2, 4, 2, 3, 7],
    'X2': [4, 7, 3, 1, 7, 1],
    'X3': [1, 0, 1, 0, 1, 1],
    'Y': [10, 21, 12, 4, 9, 4],
})

df['date'] = pd.to_datetime(df['date'])

mod = PanelOLS.from_formula('Y ~ X1 + X2 + X3 + EntityEffects',
                            data=df.set_index(['firm', 'date']))

result = mod.fit(cov_type='clustered', cluster_entity=True)
result.estimated_effects



                 estimated_effects
firm date                         
A    2021-01-01           8.179545
     2021-01-02           8.179545
     2021-01-03           8.179545
B    2021-01-01           0.258438
     2021-01-02           0.258438
     2021-01-03           0.258438
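
The per-firm values above can be read as firm-specific intercepts. As a minimal sketch, assuming only NumPy and pandas and reusing the six sample rows from the question, the following fits one dummy per firm (with no common constant) and, for comparison, the within (demeaned) regression. The variable names are illustrative and not part of linearmodels.

```python
import numpy as np
import pandas as pd

# Six sample rows from the question (dates are not needed for this check).
df = pd.DataFrame({
    "firm": ["A", "A", "A", "B", "B", "B"],
    "X1": [1, 2, 4, 2, 3, 7],
    "X2": [4, 7, 3, 1, 7, 1],
    "X3": [1, 0, 1, 0, 1, 1],
    "Y": [10, 21, 12, 4, 9, 4],
})
regressors = ["X1", "X2", "X3"]
y = df["Y"].to_numpy(dtype=float)

# Dummy-variable regression: one 0/1 column per firm, no common constant.
dummies = pd.get_dummies(df["firm"], dtype=float)
design = np.column_stack([dummies.to_numpy(), df[regressors].to_numpy(dtype=float)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
n_firms = dummies.shape[1]
firm_intercepts = pd.Series(coef[:n_firms], index=dummies.columns)
lsdv_slopes = pd.Series(coef[n_firms:], index=regressors)

# Within regression: subtract each firm's mean from Y and every regressor.
cols = regressors + ["Y"]
demeaned = df[cols] - df.groupby("firm")[cols].transform("mean")
within_coef, *_ = np.linalg.lstsq(
    demeaned[regressors].to_numpy(dtype=float),
    demeaned["Y"].to_numpy(dtype=float),
    rcond=None,
)
within_slopes = pd.Series(within_coef, index=regressors)

print(firm_intercepts)  # firm-specific intercepts
print(lsdv_slopes)      # slopes from the dummy-variable regression
print(within_slopes)    # slopes from the demeaned regression
```

On this sample the dummy coefficients come out at roughly 8.18 for firm A and 0.26 for firm B, in line with `estimated_effects`, and the slopes from the two approaches coincide. Standard errors from a plain OLS on demeaned data are not directly comparable, because that regression does not account for the degrees of freedom used up by the firm means.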
StupidWolf
  • Oh, my mistake... PanelOLS comes from ‘linearmodels’, not ‘statsmodels’. Thank you for answering. Could you explain a little more about how the firms' coefficient values are derived? – Chocolate_coffee Nov 24 '21 at 15:47
  • You can read about the model at https://bashtage.github.io/linearmodels/panel/panel/linearmodels.panel.model.PanelOLS.html#linearmodels.panel.model.PanelOLS. It's a bit outside the scope of this question, but you are basically fitting a dummy variable for each firm category – StupidWolf Nov 24 '21 at 16:22
  • It's the same as using OLS from statsmodels, `smf.ols('Y~0+firm+X1+X2+X3',data=df)` – StupidWolf Nov 24 '21 at 16:23
  • Thank you for your additional response. Following your approach, I was able to estimate the effect of every company as a dummy variable. Unfortunately, the estimates were completely different from those of the original fixed effects model. – Chocolate_coffee Nov 24 '21 at 17:39
  • So I thought of another approach: construct a new data set by subtracting each company's mean values, then run the same OLS on it. (cite: [link](https://matheusfacure.github.io/python-causality-handbook/13-Panel-Data-and-Fixed-Effects.html)) – Chocolate_coffee Nov 24 '21 at 17:40
  • In this case, I obtained the same coefficient estimates as the original fixed effects model plus a single constant, but the constant's p-value was not significant, and all of the p-values differed from those of the original firm fixed effects model. – Chocolate_coffee Nov 24 '21 at 17:40
  • The last method I tried was fitting the same fixed effects model in Stata. The results show estimates similar to Python's, and the constant is also reported. Is there any way Python can produce such a result? (Here is the Stata output, which I posted on a neighboring forum: [link](https://stats.stackexchange.com/questions/553555/questions-about-the-constant-value-of-the-fixed-effect-model-in-python-panelols)) – Chocolate_coffee Nov 24 '21 at 17:40
  • @Chocolate_coffee The solution is to add a constant to your model. Change the formula to `'Y ~ 1 + X1 + X2 + X3 + EntityEffects'` and you will get the constant too. The model fit is of course identical; the only difference is in presentation. – Kevin S Nov 30 '21 at 12:15
  • @Kevin S Thank you for the information. Given how long I had been puzzling over this, the solution was really simple. Using the method you showed me, I obtained the constant for the whole model, which had been omitted, while all coefficient estimates and p-values remained the same. – Chocolate_coffee Dec 01 '21 at 13:46