I have a coding exercise: write a function that uses a for loop to calculate the cumulative sum by id and date. I can only use cumsum() and for; no packages are allowed.

For instance, I created the data frame below:

df=data.frame("date"=c("1/1/2020","1/1/2020","1/1/2020","2/1/2020","2/1/2020","2/1/2020","3/1/2020","3/1/2020","3/1/2020"), "id"=c("A","B","C","A","B","C","A","B","C"),"val"=c(5,6,7,8,4,5,6,3,4))
PLASMA chicken
MMC

1 Answer

I downvoted your question because SO is not a free coding service. Nevertheless, your problem is simple and there are many ways to solve it. I had to make some fixes to your data frame, converting date to a Date and keeping id as character:

df = data.frame(
  "date" = as.Date(c(
    "1/1/2020",
    "1/1/2020",
    "1/1/2020",
    "2/1/2020",
    "2/1/2020",
    "2/1/2020",
    "3/1/2020",
    "3/1/2020",
    "3/1/2020"
  ), format = "%d/%m/%Y"),
  "id" = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
  "val" = c(5, 6, 7, 8, 4, 5, 6, 3, 4),
  stringsAsFactors = FALSE
)
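As a quick illustration of why the as.Date() conversion matters, here is a minimal sketch. The vector raw_dates is a hypothetical example, not part of the original data; it only shows that day/month/year strings sort as text until they are parsed.

```r
# Hypothetical standalone check; raw_dates is an illustrative name.
raw_dates <- c("1/1/2020", "2/1/2020", "10/1/2020")

# Parse as day/month/year, the same format string used in the data frame above.
parsed <- as.Date(raw_dates, format = "%d/%m/%Y")

class(parsed)                      # "Date"

# Sorting the raw strings compares characters, not calendar dates
# (method = "radix" sorts in the C locale, so the result is reproducible).
sort(raw_dates, method = "radix")  # "1/1/2020" "10/1/2020" "2/1/2020"

# Sorting the parsed values is chronological.
format(sort(parsed))               # "2020-01-01" "2020-01-02" "2020-01-10"
```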

Then, as just one of many possible approaches, a dplyr example:

library(dplyr)
summary_df <- df %>%
  group_by(date, id) %>%
  summarise(sum = cumsum(val))

Resulting in:

> summary_df
# A tibble: 9 x 3
# Groups:   date [3]
  date       id      sum
  <date>     <chr> <dbl>
1 2020-01-01 A         5
2 2020-01-01 B         6
3 2020-01-01 C         7
4 2020-01-02 A         8
5 2020-01-02 B         4
6 2020-01-02 C         5
7 2020-01-03 A         6
8 2020-01-03 B         3
9 2020-01-03 C         4
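Note that in the output above the sum column equals val, because each (date, id) pair holds a single row. If the intent is a running total per id across dates, a base R sketch using only for and cumsum(), as the question requires, might look like the following. The function name cum_by_id and the column name cum are hypothetical, and the reading of "cumulative by id and date" as "per id, in date order" is an assumption.

```r
# Rebuild the example data so the sketch runs on its own.
df <- data.frame(
  date = as.Date(rep(c("1/1/2020", "2/1/2020", "3/1/2020"), each = 3),
                 format = "%d/%m/%Y"),
  id = rep(c("A", "B", "C"), times = 3),
  val = c(5, 6, 7, 8, 4, 5, 6, 3, 4),
  stringsAsFactors = FALSE
)

# cum_by_id is a hypothetical helper name, not from the original post.
cum_by_id <- function(data) {
  # Sort by id, then date, so each id's rows are contiguous and chronological.
  data <- data[order(data$id, data$date), ]
  data$cum <- NA_real_
  # Loop over the ids and fill in each id's running total.
  for (i in unique(data$id)) {
    rows <- data$id == i
    data$cum[rows] <- cumsum(data$val[rows])
  }
  rownames(data) <- NULL
  data
}

result <- cum_by_id(df)
result$cum  # 5 13 19 6 10 13 7 12 16
```

The result is ordered by id and then date, so the three values for A come first, followed by B and C.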
Paul van Oppen