I have a dataset that looks like below
df <- data.frame(
id = 1:6, # New id column
drinks_on_previous_day1 = c(5,4,2,1,5,4),
drinks_on_previous_day2 = c(5,4,2,1,5,5),
drinks_on_previous_day3 = c(5,4,2,1,5,6),
drinks_on_previous_day4 = c(5,4,2,1,5,7),
drinks_on_previous_day5 = c(5,4,2,1,5,8),
completion_date = as.Date(c("2023-04-18 16:32:17", "2023-04-21 16:32:17", "2023-04-24 16:32:17", "2023-04-25 16:32:17", "2023-05-01 16:32:17", "2023-05-02 16:32:17"))
)
I have the drinks questions indicating the previous days, but I don't have exact dates. These would be different for each participant.
I have their completion date.
For each participant, First I'd like to calculate the dates based on their completion date. I am completely lost as to how to do that for each participant. Finally, I would like the data to be in a long format.
Here is what the dataframe should look like:
df_long <- df %>%
pivot_longer(
cols = starts_with("drinks_on_previous_day"), # Select columns to pivot
names_to = "day", # Name of the new column for day values
values_to = "drinks" # Name of the new column for drinks values
)
But instead of column names in the day column, I'd like to be the exact 5 dates for each participant based on their completion date.
Finally, I would like to plot a heatmap with the drinks on each day with one line for each participant with different colors for different number of drinks.
>5 drinks -> red
<5 drinks -> green