I have an unbalanced panel (with unequally spaced measurement points) and would like to create a lagged variable of x
by group (Variable: id
) but only for consecutive time points. My data looks like this:
# simple example with an unbalanced panel
base <- data.frame(id = rep(1:2, each = 7),
time = c(1, 2, 3, 4, 7, 8, 10, 3, 4, 6, 9, 10, 11, 14),
x = rnorm(14, mean = 3, sd = 1))
I already tried this code using dplyr
:
base_lag <- base %>% # Add lagged column
group_by(id) %>%
dplyr::mutate(lag1_x = dplyr::lag(x, n = 1, default = NA)) %>%
as.data.frame()
base_lag # Print updated data
However, this way I get a lagged variable regardless of the fact that in some cases it is not two consecutive time points.
My final data set should look like this:
id time x lag1_x
1 1 1 3.437416 NA
2 1 2 2.300553 3.437416
3 1 3 2.374212 2.300553
4 1 4 4.374009 2.374212
5 1 7 1.177433 NA
6 1 8 1.543353 1.177433
7 1 10 3.222358 NA
8 2 3 3.763765 NA
9 2 4 3.881182 3.763765
10 2 6 4.754420 NA
11 2 9 4.518227 NA
12 2 10 2.512486 4.518227
13 2 11 3.129230 2.512486
14 2 14 2.152509 NA
Does anyone here have a tip for me on how to create this lagged variable? Many thanks in advance!