
I am attempting to run a pooled logistic regression with panel data and a binary dependent variable. Since I wanted to lag some of the variables, I used the plm package to create them. When I tried to do it other ways, I ran into problems. I can't use lag or embed, because it is panel data.

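library(plm)  # provides pdata.frame() and a panel-aware lag() method

# Declare the panel structure: "state" is the unit id, "year" is the time id
hybridsubsidies <- pdata.frame(reduced, index = c("state", "year"))

lagee       <- lag(hybridsubsidies$eespending, 1)
lagratio    <- lag(hybridsubsidies$ratio, 1)
laggopvote  <- lag(hybridsubsidies$gopvote, 1)
laggasoline <- lag(hybridsubsidies$gasoline, 1)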
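For comparison, a within-state lag can also be sketched in base R without plm. This is only an illustration under assumptions: the toy data frame `panel`, its column names, and the helper `lag1` are hypothetical, and the values are the first three years of two states from the table further down.

```r
# Toy panel (hypothetical names); values copied from the table in this question
panel <- data.frame(
  state = rep(c(1, 2), each = 3),
  year  = rep(1:3, times = 2),
  ee    = c(58294, 55378, 26982, 355, 259, 224)
)

# The lag is positional, so rows must be sorted by state and then year
panel <- panel[order(panel$state, panel$year), ]

# Shift a vector down by one position, padding the first slot with NA
lag1 <- function(v) c(NA, head(v, -1))

# ave() applies lag1 separately within each state, so no value
# leaks across the boundary between two states
panel$eelag <- ave(panel$ee, panel$state, FUN = lag1)

print(panel)
```

The result is an ordinary numeric column, so the data frame can still be opened in the viewer. The first year of each state gets `NA`, as in the plm output.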

I wanted to put all the variables into the original data frame (hybridsubsidies) before I ran the pooled analysis. I'm pretty sure I don't need to, but I'm a visual person, and would like to verify the format of the data is appropriate before running any analysis.

From the output below, it looks like everything is done correctly.

head(lag(hybridsubsidies$eespending,1))

ALABAMA-1999 ALABAMA-2000 ALABAMA-2001 ALABAMA-2002 ALABAMA-2003 ALABAMA-2004
          NA        58294        55378        26982        28264         2566

head(hybridsubsidies$eespending)

ALABAMA-1999 ALABAMA-2000 ALABAMA-2001 ALABAMA-2002 ALABAMA-2003 ALABAMA-2004
       58294        55378        26982        28264         2566        26906

My problem is that when I try to assign this lagged variable as a column of the data frame, like this,

hybridsubsidies$lagee<-(lag(hybridsubsidies$eespending,1))

the assignment works (the new column is included when I list the names of the data frame), but then I can no longer view the data frame. R reports:

Error in edit.data.frame(get(subx, envir = parent), title = subx, ...) : can only handle vector and factor elements

How can I solve this so that I can view the data frame before I run the analysis? I want to inspect it because it looks like I will have to use glm instead of plm (pooling) for this analysis: the dependent variable is binary, and plm does not support binary dependent variables.
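A minimal sketch of what the pooled logistic regression with glm could look like once the lagged column is a plain vector. Everything here is hypothetical: the data frame `toy`, the outcome name `adopted`, and the numbers are made up purely to show the call, and the real model would include the other lagged regressors.

```r
# Hypothetical toy data: a binary outcome and one lagged regressor
toy <- data.frame(
  adopted = c(0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0),
  lagee   = c(1.2, 0.8, 2.5, 2.2, 1.9, 3.1, 0.5, 2.8, 1.1, 1.6, 3.4, 2.9)
)

# Pooled logit: every state-year row is treated as one observation
fit <- glm(adopted ~ lagee, family = binomial(link = "logit"), data = toy)

summary(fit)
```

glm drops rows containing `NA` by default, so the first year of each state, where the lag is missing, would not enter the fit.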

This has been giving me problems for a while now.

 col1  ST  YR  EELAG     EE
 [1,]   1   1     NA  58294
 [2,]   1   2  58294  55378
 [3,]   1   3  55378  26982
 [4,]   1   4  26982  28264
 [5,]   1   5  28264   2566
 [6,]   1   6   2566  26906
 [7,]   1   7  26906  29466
 [8,]   2   1     NA    355
 [9,]   2   2    355    259
[10,]   2   3    259    224
[11,]   2   4    224    217
[12,]   2   5    217    241
[13,]   2   6    241    231
[14,]   2   7    231    231
[15,]   3   1     NA   5111
[16,]   3   2   5111   3753
[17,]   3   3   3753   2211
[18,]   3   4   2211   1452
[19,]   3   5   1452   2913
[20,]   3   6   2913   3128
[21,]   3   7   3128   7132
[22,]   4   1     NA   1597
[23,]   4   2   1597    905

Alison
  • I don't think that lag is doing what you think - look at the border between different states or years. – hadley Jul 28 '10 at 18:08
  • Hadley, I checked the data frame, and it looks like it is doing what I want it to. Please see the table above. – Alison Jul 29 '10 at 13:18
  • I thought that when I use pdata.frame from the plm package, the data frame is converted into a time series object. Since it has the unit and time ids, I think lag is doing what I need it to do. – Alison Jul 29 '10 at 13:51

1 Answer


lag returns a time series object. Does

hybridsubsidies$lagee<-(as.vector(lag(hybridsubsidies$eespending,1)))

work?

deinst
  • why yes it does. i thought it had to do with the class of the objects, but couldn't figure it out. that really helps, thanks! – Alison Jul 28 '10 at 16:51
  • when you look at the help on functions always pay attention to the return types, and coerce them if necessary. R sometimes coerces automatically, and sometimes not. Note: I am not sure why `class(lag(v,1))` returns integer if v is an integer. I would expect it to return ts. Perhaps someone with more knowledge than I can say. – deinst Jul 28 '10 at 17:17
  • i just looked to see what type of data that command returns (without including as.vector in the command), and it returns a 'double,' which is a vector... so i'm not sure why i would have to set the lagged variable as a vector. when i checked the type of the object with as.vector included in the command, it still returned 'double'. hopefully someone can explain this – Alison Jul 29 '10 at 01:26
  • if you display the data, you will see that it is not in fact a vector. You can also see this by using is.vector() You should also take note of Hadley's comment that lag probably isn't doing what you think it is. – deinst Jul 29 '10 at 01:39
  • Ok, now I understand. A time series object is just a vector, but with the attribute of `tsp` added. I'm not sure why it does not add an attribute of `class`, but it doesn't. All lag does is modify the `tsp` attribute. – deinst Jul 29 '10 at 01:59
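The behaviour described in the last few comments can be checked in a short base R session. This sketch uses `stats::lag` on a plain numeric vector, not the plm method that applies to columns of a pdata.frame, and the variable names `x`, `lx` and `plain` are only illustrative.

```r
# A plain numeric vector (first three eespending values from the question)
x <- c(58294, 55378, 26982)

# Base lag(): the values stay where they are, only a "tsp" attribute is added
lx <- stats::lag(x, 1)

attributes(lx)   # a single attribute named "tsp"
class(lx)        # still "numeric", no "ts" class is attached
typeof(lx)       # "double", which is why the type check looked unchanged
is.vector(lx)    # FALSE, because of the extra attribute

# as.vector() strips the attribute and leaves the same numbers
plain <- as.vector(lx)
is.vector(plain) # TRUE
```

`is.vector()` returns `TRUE` only when an object has no attributes other than names, so an object can be of type double and still fail the test. The same reasoning explains why `as.vector()` makes the column acceptable to the data viewer.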