I need to compute a self-referencing variable (ind), grouped by id, whose recursive update applies only when a condition holds (e.g., time > 1). Here is a toy example:
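library(dplyr)

set.seed(13)
dt <- data.frame(id = rep(letters[1:2], each = 4), time = rep(1:4, 2), ret = rnorm(8) / 100)
# Seed the index at 100 in the first period; later periods start as NA
dt$ind <- if_else(dt$time == 1, 100, NA_real_)
dt

# Attempt: build each value from the previous period's ind within each id
dt <- dt %>%
  group_by(id) %>%
  mutate(
    ind = if_else(time > 1, lag(ind, 1) * (1 + ret), ind)
  )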
This is the output:
Obviously, mutate alone does not work in this setup: lag(ind, 1) refers to the initial values of ind (100 at time 1, NA elsewhere) and is not updated as new values are calculated, so only time 2 is filled in and later periods stay NA.
I would like to avoid running a loop. Any ideas how I can compute ind for all time periods most efficiently?
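For reference, the lag-1 recursion unrolls into a cumulative product: ind at time t is the seed (100) times the product of (1 + ret) over all periods after the first. Below is a minimal sketch under that assumption; it also assumes each id has exactly one seed row at time == 1, and the name out is only illustrative.

```r
library(dplyr)

set.seed(13)
dt <- data.frame(id = rep(letters[1:2], each = 4), time = rep(1:4, 2), ret = rnorm(8) / 100)

out <- dt %>%
  group_by(id) %>%
  arrange(time, .by_group = TRUE) %>%  # cumprod depends on row order within each id
  # Growth factor is 1 in the seed period and (1 + ret) afterwards,
  # so the running product reproduces ind_t = ind_{t-1} * (1 + ret_t)
  mutate(ind = 100 * cumprod(if_else(time > 1, 1 + ret, 1))) %>%
  ungroup()

out
```

This avoids an explicit loop because cumprod carries the running product forward within each group.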
Edit:
Thanks to everyone for the helpful answers! I have a slightly trickier extension of the above issue.
How can I deal with higher lags? E.g., with lag = 2, such that
index_{t} = index_{t-2}*(1+ret_{t})
Here is a sample data frame and a sample outcome that I produced with Excel:
set.seed(13)
dt <- data.frame(id = rep(letters[1:2], each = 5), time = rep(1:5, 2), ret = rnorm(10)/100)
dt$ind <- if_else(dt$time == 1, 120, if_else(dt$time == 2, 125, NA_real_))
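One way to read the lag-2 recursion is that odd and even periods form two independent chains: times 1, 3, 5 build on the seed at time 1, and times 2, 4 build on the seed at time 2. The sketch below relies on that reading and on time being consecutive integers starting at 1; the helper column chain and the name out_lag2 are illustrative, not part of the original data.

```r
library(dplyr)

set.seed(13)
dt <- data.frame(id = rep(letters[1:2], each = 5), time = rep(1:5, 2), ret = rnorm(10) / 100)
dt$ind <- if_else(dt$time == 1, 120, if_else(dt$time == 2, 125, NA_real_))

out_lag2 <- dt %>%
  # chain separates odd periods (1, 3, 5) from even periods (2, 4) within each id
  group_by(id, chain = time %% 2) %>%
  arrange(time, .by_group = TRUE) %>%
  # first(ind) is the seed of the chain; seed rows (time <= 2) get a factor of 1
  mutate(ind = first(ind) * cumprod(if_else(time > 2, 1 + ret, 1))) %>%
  ungroup() %>%
  select(-chain) %>%
  arrange(id, time)

out_lag2
```

For a general lag k, the same idea would group on (time - 1) %% k, assuming the first k periods of each id hold the seed values.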