I have incomplete (time) series where I would like to fill up missing values using available recent values and growth rates from another series, by category (countries). Categories, missing values are not equal length. This requires applying a function on a variable sequentially: first I need to take the last available data point (which can be anywhere) and divide it by 1+growth rate, then move to the next data point and do the same.
Example dataset and desired outcome:
require(data.table)
DT_desired<-data.table(category=c(rep("A",4),rep("B",4)),
year=2010:2013,
grwth=c(NA,.05,0.1,0,NA,0.1,0.15,0.2))
DT_desired[,values:=c(cumprod(c(1,DT_desired[category=="A"&!is.na(grwth),grwth]+1)),cumprod(c(1,DT_desired[category=="B"&!is.na(grwth),grwth]+1)))]
DT_example <- copy(DT_desired)[c(1,2,3,5),values:=NA]
What I tried: you can do it by a for loop, but that is inefficient and discouraged in R. I came to like the efficiency of data.table, and I would preferably do it in that way. I have tried the shift function of data table, which only fills one missing value (which is logical as it tries to execute at the same time I guess, when the rest is missing the previous value).
DT_example[,values:=ifelse(is.na(values),shift(values,type = "lead")/(1+shift(grwth,type = "lead")),values),by=category]
I gather from other posts that you probably can do it with the rollapply function of the zoo package, but I just got the feeling that I should be able to do it in data table without yet another additional package, and that the solution is relatively simple and elegant, just I am not experienced enough to find it.
This may very well be a duplicate and sorry if I did not notice the appropriate post, but none of what I found did exactly what I want.