I have a data frame containing the states of Germany, dates, and the number of new COVID cases per day, broken down by age group. It looks similar to this:

State    Date        Cases  Age bracket
Bavaria  01-01-2021      1  14-29
Bavaria  01-01-2021      5  30-50
Bavaria  02-01-2021      9  14-29
Bavaria  02-01-2021     10  30-50
Sachsen  01-01-2021     12  14-29
Sachsen  01-01-2021      3  30-50
Sachsen  02-01-2021     13  14-29
Sachsen  02-01-2021      6  30-50

I am trying to calculate the seven-day incidence, and I found this piece of code:

library(dplyr)

df %>%
  group_by(group = cut(date_entered, '7 days')) %>%
  summarise(date_range = paste(min(date_entered), min(date_entered) + 6, sep = '-'), 
            sum_new = sum(new)) %>%
  select(-group)

It comes from this answer to a similar question: find the sum after every seventh day but with missing days in R

The code does calculate the seven-day incidence, but it disregards which state the cases come from. Is there a way to calculate the seven-day incidence for each state separately?

  • Add state and age bracket to the group_by – Allan Cameron Jun 08 '22 at 15:26
  • Indeed, adding this solved it for me:

      library(dplyr)

      df %>%
        group_by(State, group = cut(date_entered, '7 days')) %>%
        summarise(date_range = paste(min(date_entered), min(date_entered) + 6, sep = '-'),
                  sum_new = sum(new)) %>%
        select(-group)

    – Aleksandar Markovski Jun 08 '22 at 16:48
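
The fix from the comments can be tried end to end on the sample rows from the question. This is a minimal sketch under a few assumptions: the columns are named State, Date, Cases and AgeBracket (the last one renamed here to avoid the space), the dates are in day-month-year order, and the names week, cases_7d and weekly are made up for illustration. Note that it sums cases over fixed seven-day blocks; turning that into an incidence per 100,000 inhabitants would additionally need population figures, which the data shown does not contain.

```r
library(dplyr)

# Toy data mirroring the table in the question (column names are assumed).
df <- data.frame(
  State      = rep(c("Bavaria", "Sachsen"), each = 4),
  Date       = rep(c("01-01-2021", "01-01-2021", "02-01-2021", "02-01-2021"), times = 2),
  Cases      = c(1, 5, 9, 10, 12, 3, 13, 6),
  AgeBracket = rep(c("14-29", "30-50"), times = 4)
)

weekly <- df %>%
  # Parse the text dates; "%d-%m-%Y" assumes day-month-year order.
  mutate(Date = as.Date(Date, format = "%d-%m-%Y")) %>%
  # Grouping by State as well as the 7-day bin keeps the states separate.
  group_by(State, week = cut(Date, "7 days")) %>%
  summarise(
    date_range = paste(min(Date), min(Date) + 6, sep = " to "),
    cases_7d   = sum(Cases),
    .groups    = "drop"
  ) %>%
  select(-week)

print(weekly)
```

With the sample rows this yields one row per state, with 25 cases for Bavaria and 34 for Sachsen. To keep the age groups separate as well, AgeBracket can be added to the group_by call alongside State.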
