
The dataset I am using is unbalanced, so I balanced it manually by removing the IDs that have multiple observations (this happens in my data because a single household can later split into different ones). T is 2 here. I first listed the affected IDs:

 dataset %>% group_by(ID) %>% summarise(N = n()) %>% filter(N != 2)

Then I removed these rogue observations, so the panel is now balanced. Afterwards I converted the data to pdata:

  dataset <- plm.data(dataset, 30462)
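
The removal step described above is not shown in the question. A minimal sketch of one way to do it with dplyr, on a toy data frame, is given here; the toy values and the `YEAR` column are assumptions made for illustration and are not taken from the original data.

```r
library(dplyr)

# Toy data (hypothetical): ID 3 has one row and ID 4 has three rows
toy <- data.frame(
  ID   = c(1, 1, 2, 2, 3, 4, 4, 4),
  YEAR = c(1, 2, 1, 2, 1, 1, 2, 2),
  DEP  = c(5, 6, 7, 8, 9, 1, 2, 3)
)

# Keep only the households that are observed exactly twice (T = 2)
balanced <- toy %>%
  group_by(ID) %>%
  filter(n() == 2) %>%
  ungroup()
```

Filtering inside `group_by()` keeps or drops whole households at once, so no separate list of IDs is needed.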

When I run `is.pbalanced`, it returns TRUE. The problem appears when I run the regression:

 plm(DEP ~ VAR1 + VAR2, data = dataset, model = "within")

The summary shows this:

Unbalanced Panel: n=20236, T=1-2, N=34920

I don't understand what I am missing here. Any suggestions will be greatly appreciated.

Joe
Arya
  • Guess: DEP, VAR1 or VAR2 has missing values. – Otto Kässi Oct 19 '17 at 15:15
  • Compare what is in `plm_object$model` to your original data to check which observations have been dropped in the estimation. As Otto Kässi mentioned, there are likely `NA`s. – Helix123 Oct 19 '17 at 16:24
  • BTW: it is better to use `pdata.frame()` than `plm.data()` – Helix123 Oct 22 '17 at 07:36
  • Thanks for this! Yes there are almost 8000 NAs in the Dep variable. This is what is making this unbalanced. I forgot to take this into account. – Arya Oct 22 '17 at 09:10
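
The diagnosis from the comments can be reproduced on a small example. The sketch below is only an illustration under stated assumptions: the toy values and the `YEAR` time column are hypothetical, and `pdata.frame()` is used instead of `plm.data()`, as suggested above. It shows a panel whose index is balanced while the estimation sample is not, because `plm()` drops rows with missing values in the model variables.

```r
library(plm)

# Toy panel (hypothetical values): 4 households, 2 periods, one NA in DEP
toy <- data.frame(
  ID   = rep(1:4, each = 2),
  YEAR = rep(1:2, times = 4),
  DEP  = c(1.0, 2.1, NA, 3.5, 2.2, 2.9, 4.1, 5.3),
  VAR1 = c(0.5, 0.7, 0.2, 0.9, 0.4, 0.8, 0.1, 0.6),
  VAR2 = c(1, 0, 1, 1, 0, 1, 0, 0)
)
ptoy <- pdata.frame(toy, index = c("ID", "YEAR"))

# This check looks at the ID/YEAR index only, not at NAs in the variables
index_balanced <- is.pbalanced(ptoy)

# Count missing values in the variables the model uses
na_counts <- colSums(is.na(toy[, c("DEP", "VAR1", "VAR2")]))

# plm() drops incomplete rows, so the estimation sample can be unbalanced
fit <- plm(DEP ~ VAR1 + VAR2, data = ptoy, model = "within")
sample_balanced <- pdim(fit)$balanced

# One option: drop incomplete rows first, then keep only the households
# that still have both periods
complete <- toy[complete.cases(toy[, c("DEP", "VAR1", "VAR2")]), ]
rebalanced <- complete[ave(complete$ID, complete$ID, FUN = length) == 2, ]
```

Whether to rebalance at all is a modelling choice: the within estimator also works on unbalanced panels, and dropping the partner observation of every household with an `NA` discards otherwise usable data.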

0 Answers