PLM package: Balanced data shown as unbalanced in regression

Asked Oct 19 '17 at 13:56

Active Dec 09 '18 at 22:38


The dataset I am using is unbalanced in its raw form, so I balanced it manually by removing every ID that does not have exactly two observations (such IDs occur in my data because a single household can later split into several). T is 2 here. I found those IDs like this:

 dataset %>% group_by(ID) %>% summarise(N = n()) %>% filter(N != 2)

Then I removed these rogue observations, so the panel is now balanced. Afterwards I converted the data to panel data:

  dataset <- plm.data(dataset, 30462)

When I run is.pbalanced on it, it returns TRUE. The problem appears when I run the regression:

 plm(DEP ~ VAR1 + VAR2, data = dataset, model = "within")

The summary shows this:

Unbalanced Panel: n=20236, T=1-2, N=34920

I don't understand what I am missing here. Any suggestions will be greatly appreciated.

edited Dec 09 '18 at 22:38

Joe


asked Oct 19 '17 at 13:56

Arya

Guess: DEP, VAR1 or VAR2 has missing values. – Otto Kässi Oct 19 '17 at 15:15
Compare what is in `plm_object$model` to your original data to check which observations were dropped in the estimation. As Otto Kässi mentioned, there are likely `NA`s. – Helix123 Oct 19 '17 at 16:24
BTW: it is better to use `pdata.frame()` than `plm.data()` – Helix123 Oct 22 '17 at 07:36
Thanks for this! Yes, there are almost 8000 NAs in the dependent variable. That is what makes the panel unbalanced in the regression. I forgot to take this into account. – Arya Oct 22 '17 at 09:10
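
For later readers, here is a minimal base-R sketch of the effect described in the comments above. The toy data frame and the names `toy`, `used`, `obs_per_id` and `balanced` are made up for illustration. `complete.cases()` is used here only as a stand-in for the NA handling that happens when the model frame is built, so treat it as an approximation of what the regression does, not as plm's actual code path. It shows how a panel that is balanced row-wise can become T = 1-2 once rows with a missing dependent variable are dropped:

```r
# Toy panel: 3 households (ID), each observed in 2 periods,
# so the raw data is balanced (hypothetical data for illustration).
toy <- data.frame(
  ID   = rep(1:3, each = 2),
  TIME = rep(1:2, times = 3),
  DEP  = c(1.0, 2.0, NA, 4.0, 5.0, 6.0),  # one NA in the dependent variable
  VAR1 = c(0.5, 0.7, 0.2, 0.9, 0.4, 0.1)
)

# Every ID has exactly 2 rows before estimation.
table(toy$ID)

# Keep only rows that are complete in the variables used by the model.
used <- toy[complete.cases(toy[, c("DEP", "VAR1")]), ]

# Count the remaining observations per ID: ID 2 is left with a single row,
# so the estimation sample is no longer balanced.
obs_per_id <- table(used$ID)
obs_per_id

# Keep only IDs that still have both periods after the NA rows are gone.
balanced <- used[used$ID %in% names(obs_per_id)[obs_per_id == 2], ]
table(balanced$ID)
```

Running the balance check on the complete cases of the model variables, instead of on the raw rows, should give a panel that also stays balanced inside the regression.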
