For a model I'm building, I want to create multiple lag terms for every column in my data table.
For example, with the following data table:
library(data.table)
a <- c('x','x','x','y','y','y')
b <- runif(6, min=0, max=20)
c <- runif(6, min=50, max=1000)
df <- data.table(a, b, c)
I can use the following code to create 2 lag terms for variable b within each group a:
df[,c(paste("b","_L",1:2,sep="")):=lapply(1:2, function(i) c(rep(NA, i),head(b, -i))),by=a]
However, my real data table is large (100+ variables), and I don't want to repeat this line 100+ times (once per variable).
I tried putting the code inside a loop over a list of variable names, but the names in the list don't seem to be recognized or passed into the code properly:
looplist <- colnames(df[,!1])
for (l in looplist) {
df[,c(paste(l,"_L",1:2,sep="")):=lapply(1:2, function(i) c(rep(NA, i),head(l, -i))),by=a]
}
Any advice on how to make this loop work across variables, or any other methods to accomplish the same objective (create multiple LAG terms for each and every variable in the data table) will be greatly appreciated!
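One direction that might work, as a minimal sketch rather than a definitive answer: in the loop above, `l` holds a column name as a string, so `head(l, -i)` operates on the string itself rather than on the column. Looking the column up with `get()` inside `j` is one way around that. The sketch below also swaps the manual `rep`/`head` construction for data.table's `shift()`, which is my substitution and not part of the code above. The names `lag_cols`, `n_lags`, `v`, and `new_cols` are made up for illustration, and the grouping column is assumed to be `a`.

```r
library(data.table)

# Same toy data as above
df <- data.table(
  a = c("x", "x", "x", "y", "y", "y"),
  b = runif(6, min = 0, max = 20),
  c = runif(6, min = 50, max = 1000)
)

lag_cols <- setdiff(names(df), "a")  # every column except the grouping column (assumed to be "a")
n_lags   <- 1:2                      # which lags to create

for (v in lag_cols) {
  new_cols <- paste0(v, "_L", n_lags)  # e.g. "b_L1", "b_L2"
  # get(v) resolves the string in v to the actual column;
  # shift() with a vector n returns one lagged vector per lag, padded with NA
  df[, (new_cols) := shift(get(v), n = n_lags), by = a]
}
```

After the loop, `df` should have `b_L1`, `b_L2`, `c_L1`, and `c_L2` next to the original columns, with the lags restarting within each group of `a`.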