I have a dataset of people's blood results in R:
ID Result date
A1 80 01/01/2006
A1 70 01/01/2009
A1 61 01/01/2010
A1 30 01/01/2018
A1 28 01/01/2022
B2 40 01/01/2006
B2 30 01/01/2012
B2 25 01/01/2015
B2 10 01/01/2020
G3 28 01/01/2009
G3 27 01/01/2014
G3 25 01/01/2013
G3 24 01/01/2011
G3 22 01/01/2019
I would like to plot these, fit a linear regression for each person, and find the predicted date at which their blood result would reach 15. In reality there are multiple blood tests per year, so I would take the median result per year, plot those, and get the predicted year in which the result reaches 15.
I can do the median per year and use the lm function along the lines of:
# reduce each date to its year, stored as a number so lm() treats it as continuous
filtered$date_of_bloods <- as.numeric(format(filtered$date_of_bloods, format = "%Y"))
# split into individual ID groups
a <- split(filtered, filtered$ID)
# aggregate median results per year
medianfunc <- function(y) {aggregate(results ~ date_of_bloods, data = y, FUN = median)}
medians <- lapply(a, medianfunc)  # lapply keeps one data frame per ID
# do lm per ID cohort and get intercept and slope of lines
coefLM <- function(x) {coef(lm(date_of_bloods ~ results, data = x))}
coefs <- sapply(medians, coefLM)
But I am unsure how to extract the prediction for when the fitted line would hit 15.
Any help would be much appreciated.
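For reference, here is a minimal sketch of one way the prediction could be obtained, using the sample data above. It assumes the column names `results` and `date_of_bloods` from the code in the post. The names `bloods`, `year` and `year_at_target` are made up for illustration. Unlike the `lm` call above, it regresses the result on the year and then solves the fitted line `results = intercept + slope * year` for the year at which the result equals 15. Treat it as an illustration under those assumptions rather than a definitive method.

```r
# Sample data from the post (dates are dd/mm/yyyy)
bloods <- data.frame(
  ID = rep(c("A1", "B2", "G3"), times = c(5, 4, 5)),
  results = c(80, 70, 61, 30, 28, 40, 30, 25, 10, 28, 27, 25, 24, 22),
  date_of_bloods = as.Date(
    c("01/01/2006", "01/01/2009", "01/01/2010", "01/01/2018", "01/01/2022",
      "01/01/2006", "01/01/2012", "01/01/2015", "01/01/2020",
      "01/01/2009", "01/01/2014", "01/01/2013", "01/01/2011", "01/01/2019"),
    format = "%d/%m/%Y")
)

# Numeric year, so it can be used as a continuous predictor
bloods$year <- as.numeric(format(bloods$date_of_bloods, "%Y"))

# Median result per person per year
medians <- aggregate(results ~ ID + year, data = bloods, FUN = median)

# Hypothetical helper: fit results ~ year for one person, then invert the
# fitted line to find the year at which the result equals `target`
year_at_target <- function(d, target = 15) {
  fit <- lm(results ~ year, data = d)
  b <- coef(fit)  # b[1] is the intercept, b[2] is the slope
  data.frame(ID = d$ID[1],
             intercept = unname(b[1]),
             slope = unname(b[2]),
             predicted_year = unname((target - b[1]) / b[2]))
}

# One row per person
predictions <- do.call(rbind, lapply(split(medians, medians$ID), year_at_target))
predictions
```

If the original direction `date_of_bloods ~ results` is kept instead, `predict(fit, newdata = data.frame(results = 15))` would give the year directly, but the two regressions generally give different answers. Either way, a predicted year beyond the last observation is an extrapolation.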