I'm trying to add missing lines for "day" and extrapolate the data for "value". In my data each subject ("id") has 2 periods (period 1 and period 2) and values for consecutive days.
An example of my data looks like this:
df <- data.frame(
id = c(1,1,1,1, 1,1,1,1, 2,2,2,2, 2,2,2,2, 3,3,3,3, 3,3,3,3),
period = c(1,1,1,1, 2,2,2,2, 1,1,1,1, 2,2,2,2, 1,1,1,1, 2,2,2,2),
day= c(1,2,4,5, 1,3,4,5, 2,3,4,5, 1,2,3,5, 2,3,4,5, 1,2,3,4),
value =c(10,12,15,16, 11,14,15,17, 13,14,15,16, 15,16,18,20, 16,17,19,29, 14,16,18,20))
For each id and period I am missing data for days 3,2,1,4,1,5, respectively. I want to expand the data to let's say 10 days and extrapolate the data on value column (e.g. with linear regression).
My final df should be something like that:
df2 <- data.frame(
id = c(1,1,1,1,1,1,1, 1,1,1,1,1,1,1, 2,2,2,2,2,2,2, 2,2,2,2,2,2,2, 3,3,3,3,3,3,3, 3,3,3,3,3,3,3),
period = c(1,1,1,1,1,1,1, 2,2,2,2,2,2,2, 1,1,1,1,1,1,1, 2,2,2,2,2,2,2, 1,1,1,1,1,1,1, 2,2,2,2,2,2,2),
day= c(1,2,3,4,5,6,7, 1,2,3,4,5,6,7, 1,2,3,4,5,6,7, 1,2,3,4,5,6,7, 1,2,3,4,5,6,7, 1,2,3,4,5,6,7),
value =c(10,12,13,15,16,17,18, 11,12,14,15,17,18,19, 12,13,14,15,16,18,22, 15,16,18,19,20,22,23, 15,16,17,19,29,39,49, 14,16,18,20,22,24,26))
The most similar example I found doesn't extrapolate by two variables (ID and period in my case), it extrapolates only by year. I tried to adapt the code but no success :(
Another example extrapolates the data by multiple id but doesn't add rows for missing data.
I couldn't combine both codes with my limited experience in R. Any suggestions? Thanks in advance...