I have the following data set, which is panel data. The full data set has about 78 million rows. It also has a few more columns, which I have omitted here.

                 date      stockName   PRC VOLUME
2 2016-06-01 09:30:53 ABCD IS Equity 14.25  13957
3 2016-06-01 09:30:54 EFGH IS Equity 14.25  14620
4 2016-06-01 09:31:04 IJKL IS Equity 14.25  14120
5 2016-06-01 09:31:11 MNOP IS Equity 14.25  13820
6 2016-06-01 09:31:47 ABCD IS Equity 14.30  20408
7 2016-06-01 09:31:58 EFGH IS Equity 14.30  29776

As far as I understand, plain biglm is not designed for panel data. Please correct me if I am wrong. How can I use it for panel data? Any comments or suggestions are welcome.

Zico
  • It's a pre-programming question: I am seeking advice on how to run (non-linear) panel data regressions through biglm in R – Zico Sep 26 '16 at 11:30

2 Answers

The Econometrics task view on CRAN gives an overview of the packages available for econometric analysis.

As a suggestion, lme4, nlme and possibly pglm may be what you are looking for (nonlinear panel data), although I don't know how well they perform with this many rows.

Although these packages use the jargon of mixed-effects models, the plm vignette briefly explains how that terminology maps onto the one used by econometricians.

Rodrigo Remedio
  • Yes, thank you, I am also looking into the same packages, but I am not sure about their predictive power on large-scale data. – Zico Sep 27 '16 at 14:42
  • If you are looking for fixed effects models (in the panel data/econometrics sense): for the amount of data you have, package `lfe` (command: `felm`) might come in handy as it has a C implementation for the data transformation which is fast. – Helix123 Sep 27 '16 at 17:19

If what you want to do is estimate a linear model with fixed effects, one possibility is to do it in two steps.

  • Step 1: Within-transformation of your outcome and covariates: build the within-transformed variables X and Y by subtracting each individual's mean of a variable from that variable.

  • Step 2: Regression of the within-transformed variables using biglm
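
A minimal sketch of these two steps in R, on a simulated toy panel. The column names `id`, `x`, `y` and the four-chunk split are assumptions made for illustration, not taken from the question; in the question's data the panel identifier would presumably be `stockName`. Note that the group means must be computed over the full data set, not per chunk.

```r
library(biglm)  # assumed to be installed

set.seed(1)
# Hypothetical toy panel: 50 individuals, 200 observations each
n_id <- 50
n_t  <- 200
d <- data.frame(id = rep(seq_len(n_id), each = n_t))
alpha <- rnorm(n_id)[d$id]            # individual-specific fixed effect
d$x <- rnorm(nrow(d)) + alpha         # regressor correlated with the effect
d$y <- 2 * d$x + alpha + rnorm(nrow(d))

# Step 1: within-transformation (subtract each individual's own mean)
d$y_w <- d$y - ave(d$y, d$id)
d$x_w <- d$x - ave(d$x, d$id)

# Step 2: regress the transformed variables with biglm, chunk by chunk.
# No intercept: demeaning has already removed it.
chunks <- split(d, rep(1:4, length.out = nrow(d)))
fit <- biglm(y_w ~ x_w - 1, data = chunks[[1]])
for (ch in chunks[-1]) fit <- update(fit, ch)

coef(fit)  # slope estimate for x_w (the within estimator)
```

The slope matches what a dummy-variable regression `lm(y ~ x + factor(id))` would give, but the standard errors reported by biglm in step 2 do not account for the degrees of freedom used up by the individual means, so they would need a correction before being used for inference.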

Roland
  • I forgot to mention the model is non-linear – Zico Sep 26 '16 at 10:06
  • Could you indicate what the type of the outcome (binary, count,...) is? Also, it would be helpful to know what estimation command you would use if there was no dimension problem. – Roland Sep 27 '16 at 09:37
  • To be very specific, I am looking for a `plm`-like package (see this [link](https://cran.r-project.org/web/packages/plm/index.html)) or tool for performing `panel data regression` through `biglm` in `R` – Zico Sep 27 '16 at 10:03