0

I have a data.frame of exercise sessions completed for multiple participants (G1, G2 etc.) There a different types of exercise session (Sup, Home, etc.) with 2 sessions of each type per week. I have a "week" variable, but I need to give summary data by month, i.e. take 52 weeks of data and split it into 12 months. This is example data:

qdf = data.frame(id = rep(c("G1", "G2", "G3"), 16),
                 type = c(rep("Sup",24), rep("Home", 24)),
                 week = rep(c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4),2),
                 session = c(rep("S1",3), rep("S2", 3), rep("S3",3), rep("S4",3),
                             rep("S5",3), rep("S6", 3), rep("S7",3), rep("S8",3),
                             rep("H1",3), rep("H2", 3), rep("H3",3), rep("H4",3),
                             rep("H5",3), rep("H6", 3), rep("H7",3), rep("H8",3)))

My current idea on how to do this would be, after grouping by id, to assign a dummy date to each session, starting with 01-01-2020 for sessions S1 and H1, and then splitting the year into months based on dates. Alternatively, to add extra dummy rows, so that each week has 7 days, then assign dates and split.

I'm really not sure how to begin with either of these possible solutions, or if there isn't a better way.

Mark Davies
  • 787
  • 5
  • 18
  • 1
    Mapping (ISO) weeks to months will depend on the year and is not universal. Do you have any other date information? – Maurits Evers May 24 '21 at 09:12
  • @MauritsEvers each participant started at different times over a period of, completing 52 weeks. The exact dates aren't important. I will assign the first approx 4.3 weeks/ 30 days to "Month 1" etc. As an alternative, I could do something like `qdf%>% mutate ( month = case_when(week >= 48 ~ 12, week >= 43 ~11``` etc, grouping either 4 or 5 weeks into a month. There's a bit more accuracy to be had if I consider that each week can be further split by session (i.e. S1 to S8 = month 1) but this would require a lot of lines of code as multiple session types – Mark Davies May 24 '21 at 09:25

1 Answers1

1

If you set a starting point, for instance 2020-01-01, you can do the following:

qdf <- data.frame(id = rep(c("G1", "G2", "G3"), 16),
                 type = c(rep("Sup",24), rep("Home", 24)),
                 week = rep(c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4),2),
                 session = c(rep("S1",3), rep("S2", 3), rep("S3",3), rep("S4",3),
                             rep("S5",3), rep("S6", 3), rep("S7",3), rep("S8",3),
                             rep("H1",3), rep("H2", 3), rep("H3",3), rep("H4",3),
                             rep("H5",3), rep("H6", 3), rep("H7",3), rep("H8",3)))

library(lubridate)
qdf <- qdf %>% mutate(
  date = ymd("2020-01-01") + weeks(week),
  month = month(date)
)
> head(qdf)
  id type week session month       date
1 G1  Sup    1      S1     1 2020-01-08
2 G2  Sup    1      S1     1 2020-01-08
3 G3  Sup    1      S1     1 2020-01-08
4 G1  Sup    1      S2     1 2020-01-08
5 G2  Sup    1      S2     1 2020-01-08
6 G3  Sup    1      S2     1 2020-01-08

I haven't removed the date column on purpose so you can check what's going on.

xaviescacs
  • 309
  • 1
  • 5