0
data <- data.frame(
  record_id = c(1,1,1,2,3,4,4,5,6,7,8,8,9,10,10,10),
  visit_date = c("2020-09-24", "2020-12-05", "2021-03-01", "2021-10-03", "2021-10-01", "2021-10-05", "NA", "2021-08-25", "2021-09-19", "2021-10-01", "2021-09-27", "2021-09-07", 
"2021-10-03", "2021-10-08", "2022-03-22", "2022-07-12"),
  repeat_instance = c(0,1,2,0,0,0,1,0,0,0,0,1,0,0,1,2),
  Time_Since_Appointment = c("NA", "72d 1H 0M 0S", "86d 0H 0M 0S", "NA", "NA", "NA", "NA",
"NA", "NA", "NA", "NA", "1076d 0H 0M 0S", "NA", "NA", "165d 0H 0M 0S", "112d 0H 0M 0S")
)

data1 <- data %>% 
  group_by(record_id) %>%
  mutate(Time_Since_Appointment = visit_date - lag(visit_date))

I am interested in calculating a new variable that would capture the number of days or weeks since today using the most recent repeat_instance, which differs for each record, or if there is not a repeat_instance calculates the time since the baseline visit date.

Thank you so much and please let me know if additional information should be supplied.

MrFlick
  • 195,160
  • 17
  • 277
  • 295
  • So what's the desired output for this sample input? Are the `Time_Since_Appointment ` correct but you want to automatically generate them? Are your actual visit_date values Date objects, or are they strings like you have in your example? – MrFlick Aug 29 '22 at 13:42

1 Answers1

0

First, I would take the maximum (most recent) date per ID. Next, I would then use the difftime() function to find the difference between the most recent visit and today. In this case the function today() calls today's date, but this could be replaced with a hard-coded date. The difftime() function allows you to select the units that you would like. In this case I calculated the difference in both days and weeks.

Using the data.table package:

data <- data.table(
  record_id = c(1,1,1,2,3,4,4,5,6,7,8,8,9,10,10,10),
  visit_date = c("2020-09-24", "2020-12-05", "2021-03-01", "2021-10-03",
                 "2021-10-01", "2021-10-05", "NA", "2021-08-25",
                 "2021-09-19", "2021-10-01", "2021-09-27", "2021-09-07", 
                  "2021-10-03", "2021-10-08", "2022-03-22", "2022-07-12"))

data <- data[,.(last_visit = max(as.Date(visit_date,"%Y-%m-%d"))),by=.(record_id)]
data <- data[,days_since_last_visit := difftime(today(),last_visit,units="days")]
data <- data[,weeks_since_last_visit := difftime(today(),last_visit,units="weeks")]

data
    record_id last_visit days_since_last_visit weeks_since_last_visit
 1:         1 2021-03-01              546 days        78.000000 weeks
 2:         2 2021-10-03              330 days        47.142857 weeks
 3:         3 2021-10-01              332 days        47.428571 weeks
 4:         4       <NA>               NA days               NA weeks
 5:         5 2021-08-25              369 days        52.714286 weeks
 6:         6 2021-09-19              344 days        49.142857 weeks
 7:         7 2021-10-01              332 days        47.428571 weeks
 8:         8 2021-09-27              336 days        48.000000 weeks
 9:         9 2021-10-03              330 days        47.142857 weeks
10:        10 2022-07-12               48 days         6.857143 weeks
RyanF
  • 119
  • 6