I am having some problems using the lag function in dplyr. This is my dataset:
ID <- c(100, 100, 100, 200, 200, 300, 300)
daytime <- c("2010-12-21 06:00:00", "2010-12-21 09:00:00", "2010-12-21 13:00:00", "2010-12-23 23:00:00", "2010-12-24 02:00:00", "2010-12-25 19:00:00", "2010-12-31 08:00:00")
lagfirstvisit <- c(0, 0, 2, 0, 1, 0, 0)
# data.frame() keeps ID and lagfirstvisit numeric; cbind() would coerce every column to character
table <- data.frame(ID, daytime, lagfirstvisit, stringsAsFactors = FALSE)
table$daytime <- as.POSIXct(table$daytime)
My aim is to generate a new column containing daytime lagged by the number of rows given in the lagfirstvisit column. For example, if lagfirstvisit == 2, I want the daytime value from two rows earlier within the same ID. If lagfirstvisit == 0, the row keeps its own daytime value.
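To make the intended behaviour concrete, here is a minimal sketch on a plain vector for a single ID. The helper name lag_by_row is hypothetical (it is not a dplyr function), and the sketch assumes the rows are already in visit order.

```r
# Hypothetical helper (not part of dplyr): lag each element of x by its own offset in n.
lag_by_row <- function(x, n) {
  idx <- seq_along(x) - as.integer(n)  # position to read from, one per row
  idx[idx < 1] <- NA                   # offsets reaching before the first row give NA
  x[idx]
}

# The three visits of ID 100, in order, with their per-row lags
visits <- c("2010-12-21 06:00:00", "2010-12-21 09:00:00", "2010-12-21 13:00:00")
lag_by_row(visits, c(0, 0, 2))
# the third element is taken from two rows earlier, i.e. "2010-12-21 06:00:00"
```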
My expected result is as follows:
ID <- c(100, 100, 100, 200, 200, 300, 300)
daytime <- c("2010-12-21 06:00:00", "2010-12-21 09:00:00", "2010-12-21 13:00:00", "2010-12-23 23:00:00", "2010-12-24 02:00:00", "2010-12-25 19:00:00", "2010-12-31 08:00:00")
lagfirstvisit <- c(0, 0, 2, 0, 1, 0, 0)
result <- c("2010-12-21 06:00:00", "2010-12-21 09:00:00", "2010-12-21 06:00:00", "2010-12-23 23:00:00", "2010-12-23 23:00:00", "2010-12-25 19:00:00", "2010-12-31 08:00:00")
table.results <- cbind(ID, daytime, lagfirstvisit, result)
Currently, the code I am using is:
library(dplyr)

table <- table %>%
  group_by(ID) %>%
  mutate(result = lag(as.POSIXct(daytime, format="%m/%d/%Y %H:%M:%S", tz= "UTC"), n = as.integer(lagfirstvisit)))
However, I get the error:
Error in mutate_impl(.data, dots) : Evaluation error: n must be a non-negative integer scalar, not integer of length 3.
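As far as I can tell from the message, lag() accepts only a single n for the whole vector, whereas within the group for ID 100 lagfirstvisit supplies three values, one per row. One direction that might work is to index daytime by row position inside each group instead of calling lag(). The following is only a sketch under assumptions: the data frame name visits is made up for this example, the timestamps are parsed as UTC, and lagfirstvisit is assumed never to reach before the first row of its ID.

```r
library(dplyr)

# Same data as above, rebuilt under a different (illustrative) name
visits <- data.frame(
  ID = c(100, 100, 100, 200, 200, 300, 300),
  daytime = as.POSIXct(
    c("2010-12-21 06:00:00", "2010-12-21 09:00:00", "2010-12-21 13:00:00",
      "2010-12-23 23:00:00", "2010-12-24 02:00:00", "2010-12-25 19:00:00",
      "2010-12-31 08:00:00"),
    tz = "UTC"
  ),
  lagfirstvisit = c(0, 0, 2, 0, 1, 0, 0)
)

visits <- visits %>%
  group_by(ID) %>%
  # row_number() is the position within the ID group; subtracting the per-row
  # lag gives the position to read daytime from. Assumes the index stays >= 1.
  mutate(result = daytime[row_number() - as.integer(lagfirstvisit)]) %>%
  ungroup()
```

On the sample data this gives the values listed in my expected result, but I have not checked how it behaves when a lag is larger than the number of earlier rows in the group.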
Does anyone know how I can resolve this problem? Thank you very much!