2

I'm new using the zoo package, so maybe it's an easy question. I have the following data frame (df):

library(lubridate)
library(zoo)
library(dplyr)

Date <- c("2010-01-28", "2010-01-28", "2010-02-28", 
           "2010-02-28", "2010-03-28", "2010-03-28",
           "2010-04-28", "2010-04-28")

Date <- as_date(Date)
Amount <- 1:8
Prod <- c("Corn", "Potato","Corn", "Potato","Corn", "Potato","Corn", "Potato")
df <- data.frame(Date, Prod, Amount)
print(df)

 Date         Prod    Amount
 2010-01-28   Corn      1
 2010-01-28 Potato      2
 2010-02-28   Corn      3
 2010-02-28 Potato      4
 2010-03-28   Corn      5
 2010-03-28 Potato      6
 2010-04-28   Corn      7
 2010-04-28 Potato      8

What I want is to calculate the rolling sum for each variable, with a "window" of 3 days, and then make a new data frame, equal as follows:

 Date        Prod     Amount
 2010-03-28   Corn      9
 2010-03-28 Potato      12
 2010-04-28   Corn      15
 2010-04-28 Potato      18

Probably rollapply() and dplyr could do the job, but I don't know how to resolve this. I appreciate it if someone can help :)

Johan Rosa
  • 2,797
  • 10
  • 18
Jones
  • 333
  • 1
  • 11

1 Answers1

1

I did it using dplyr::lag()

library(dplyr)
library(tibble)

## Data
data <- tribble(
  ~Date,        ~Prod,    ~Amount,
  "2010-01-28",   "Corn",      1,
  "2010-01-28", "Potato",      2,
  "2010-02-28",   "Corn",      3,
  "2010-02-28", "Potato",      4,
  "2010-03-28",   "Corn",      5,
  "2010-03-28", "Potato",      6,
  "2010-04-28",   "Corn",      7,
  "2010-04-28", "Potato",      8
)

# Code

data %>% 
  group_by(Prod) %>% 
  mutate(cum_amount = Amount + lag(Amount, 1) + lag(Amount, 2)) %>% 
  filter(!is.na(cum_amount))


# A tibble: 4 x 4
# Groups:   Prod [2]
  Date       Prod   Amount cum_amount
  <chr>      <chr>   <dbl>      <dbl>
1 2010-03-28 Corn        5          9
2 2010-03-28 Potato      6         12
3 2010-04-28 Corn        7         15
4 2010-04-28 Potato      8         18

Update in order to your comment

data %>% 
  group_by(Prod) %>% 
  mutate(cum_amount = c(rep(NA, 2), zoo::rollsum(Amount, 3))) %>% 
  filter(!is.na(cum_amount))

PS: Remember to include the R tag in your questions

Johan Rosa
  • 2,797
  • 10
  • 18
  • Thank you! problem solved. But I will go a little further: is it possible to have a more generic code? Suppose I had a 10-day "window". Instead of repeating `lag (Amount, n)` 10 times, would it be possible to make it cleaner? – Jones Oct 27 '20 at 23:04
  • 1
    I updated my answer. Let me know in case of something else. – Johan Rosa Oct 28 '20 at 20:37