I have a dataset of individual subject records with birth, diagnosis, and death dates. I would like to turn this into longitudinal data that shows whether or not subjects have been born, have been diagnosed (diagnosis can happen before or after birth), or have died for each week over the study period. Note that birth or diagnosis can happen before the study period.
Study period: 4/1/2021 - 4/31/2021 (weeks start Monday, so the actual study weeks would start on 3/29/2021, 4/5/2021, 4/12/2021, 4/19/2021, and 4/26/2021).
Sample data:
tibble(id=seq(1:4),
date_birth=c("2/28/2021", "3/2/2021", "4/3/2021", "4/15/2021"),
date_dx=c("3/4/2021", "4/15/2021", NA, "4/9/2021"),
date_death=c("4/5/2021", "4/20/2021", NA, "4/23/2021"))
Logic:
If date_birth <= study_week, born = 1, else born = 0
If date_dx <= study_week, dx = 1, else dx = 0
If date_death <= study_week, dead = 1, else dead = 0
Subjects never go from 1 back to 0
Desired output:
tibble(study_week=rep(seq.Date(as.Date("3/29/2021", format="%m/%d/%Y"), by="week", length.out = 5),4),
id=c(rep(1,5), rep(2,5), rep(3,5), rep(4,5)),
born=c(1,1,1,1,1,
1,1,1,1,1,
0,1,1,1,1,
0,0,0,1,1),
dx=c(1,1,1,1,1,
0,0,0,1,1,
0,0,0,0,0,
0,0,1,1,1),
dead=c(0,1,1,1,1,
0,0,0,0,1,
0,0,0,0,0,
0,0,0,0,1))
How can I create this output? Thanks!