
I am new to R, having moved from SAS.
I have time-series cross-sectional (panel) data with 24 months of observations for each ID. A snapshot looks like this:

ID Time Var  
1 201201 2.5  
1 201202 3.2  
1 201203 4.1  
1 201204 3.2  
1 201205 4.1  
2 201201 1.8  
2 201202 5.6  
2 201203 4.5  
2 201204 9.2  
2 201205 8.1   

Now I need to create Var1, Var2, Var3, Var4 and Var5 as five lags of Var, where Var1 is lagged by 1, Var2 by 2, and so on. The `slide` function from the DataCombine package can do this, but it requires R >= 2.15.3 and I can't install R 2.15.3 or above.

Could you please help me solve this problem? In SAS I could have done this with PROC PANEL, but I don't know how to do it in R.

Netloh
Jyotendra
  • Henrik, your answer has really helped, thanks a lot. However, when I use this code on my dataset, I get an error (probably due to the size of the dataset and the number of variables). I am trying to fix that by reading more about zoo and ddply :( – Jyotendra Nov 10 '13 at 14:42
  • This is a possible duplicate: http://stackoverflow.com/questions/1971461/generating-a-lagged-time-series-cross-sectional-variable-in-r?rq=1 – Maximilian Nov 11 '13 at 10:24
  • Yes Max, I agree that the question seems to be a duplicate. However, I was trying to create 5 lags and was not able to do so using the solution there. – Jyotendra Nov 11 '13 at 10:34

2 Answers


Maybe you're looking for embed?

# copy the sample data to the clipboard first (readClipboard() is Windows-only)
df <- read.table(text = readClipboard(), header = TRUE)
embed(df$Var, 5)
#      [,1] [,2] [,3] [,4] [,5]
# [1,]  4.1  3.2  4.1  3.2  2.5
# [2,]  1.8  4.1  3.2  4.1  3.2
# [3,]  5.6  1.8  4.1  3.2  4.1
# [4,]  4.5  5.6  1.8  4.1  3.2
# [5,]  9.2  4.5  5.6  1.8  4.1
# [6,]  8.1  9.2  4.5  5.6  1.8
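
Note that `embed` applied to the whole column runs across ID boundaries: row 2 of the output above starts with the first value of ID 2 and continues with values from ID 1. One way to keep the lags within each ID using only base R, which should also run on older R versions, might look like the sketch below. `lag_k` is a hypothetical helper name chosen for illustration, not a base R function, and the sample data is typed in from the question.

```r
# Sample data from the question (first five months for two IDs)
df <- data.frame(
  ID   = rep(1:2, each = 5),
  Time = rep(201201:201205, times = 2),
  Var  = c(2.5, 3.2, 4.1, 3.2, 4.1, 1.8, 5.6, 4.5, 9.2, 8.1)
)

# Hypothetical helper: shift x down by k positions, padding the start with NA
lag_k <- function(x, k) c(rep(NA, k), x)[seq_along(x)]

# Sort so that "previous row" means "previous month" within each ID
df <- df[order(df$ID, df$Time), ]

# Add Var1..Var5; ave() applies the shift separately within each ID
for (k in 1:5) {
  df[[paste("Var", k, sep = "")]] <- ave(df$Var, df$ID,
                                         FUN = function(x) lag_k(x, k))
}

df
```

With only five rows per ID in the sample, `Var5` is `NA` throughout; with the full 24 months it would be filled from the sixth month onward.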
Matthew Plourde

If you want to lag within each ID, you may try this:

library(plyr)
library(zoo)
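df2 <- ddply(.data = df, .variables = .(ID), function(x){
  # zoo's lag() follows the stats::lag sign convention: with positive k,
  # "lag1" holds the next observation (a lead); use negative k for past values
  lag(zoo(x$Var), k = 0:4)
})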
df2
#    ID lag0 lag1 lag2 lag3 lag4
# 1   1  2.5  3.2  4.1  3.2  4.1
# 2   1  3.2  4.1  3.2  4.1   NA
# 3   1  4.1  3.2  4.1   NA   NA
# 4   1  3.2  4.1   NA   NA   NA
# 5   1  4.1   NA   NA   NA   NA
# 6   2  1.8  5.6  4.5  9.2  8.1
# 7   2  5.6  4.5  9.2  8.1   NA
# 8   2  4.5  9.2  8.1   NA   NA
# 9   2  9.2  8.1   NA   NA   NA
# 10  2  8.1   NA   NA   NA   NA
Henrik
  • Hey, hope I can get some further insights after all those years. I just used this code for my data a couple of hours ago and it worked perfectly. Now I wanted to run it again but I get the following error message: "Error in list_to_dataframe(res, attr(.data, "split_labels"), .id, id_as_factor) : Results do not have equal lengths". Any clue why's that and what I could do about it? – ilka Jul 27 '22 at 22:20
  • Hi @ilka! Sorry, I don't use `plyr` anymore. Here's how I would do it with `data.table` (no need for `zoo` either): `library(data.table)`; `setDT(d)` (or use `fread` if you read data from file); `k = 1:4`; `d[ , (paste0("lead", k)) := shift(Var, k, type = "lead"), by = ID]`. Good luck! – Henrik Jul 27 '22 at 22:37
  • Thanks for the reply, @Henrik! I also usually use `data.table` for leads/lags, but the way the data I'm currently working with is organized doesn't allow for `shift()` (at least it doesn't produce the desired result). I got the right results with the `slide()` function from `DataCombine`. – ilka Sep 08 '22 at 22:34
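
The `data.table` one-liner in Henrik's comment creates leads (`type = "lead"`). For the five lags asked about in the question, the same idea might be sketched as follows. This assumes a `data.table` version recent enough to provide `shift`, and the `Var1`..`Var5` column names are taken from the question; the sample data is typed in by hand.

```r
library(data.table)

# Sample data from the question (first five months for two IDs)
d <- data.table(
  ID   = rep(1:2, each = 5),
  Time = rep(201201:201205, times = 2),
  Var  = c(2.5, 3.2, 4.1, 3.2, 4.1, 1.8, 5.6, 4.5, 9.2, 8.1)
)

# Order by ID and month so that shifting by rows means shifting by months
setorder(d, ID, Time)

# shift() returns one vector per element of k; := assigns them to Var1..Var5,
# and by = ID keeps values from leaking across IDs
k <- 1:5
d[, (paste0("Var", k)) := shift(Var, n = k, type = "lag"), by = ID]

d
```

Shifting by row position only matches "previous month" if every ID has one row per month with no gaps, which may be why `shift()` did not give the desired result on differently organized data.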