0

I have an irregular time series, and i'm trying to account with R a rolling sum on 3 month window for each operation associated to an ID.

data are structured as follow

ID    Operation    date         value
A         1       01/01/2017      0
A         2       01/02/2017      1
A         3       01/06/2017      1
A         4       01/09/2017      0
B         1       01/03/2017      0
B         2       01/05/2017      1
B         3       01/09/2017      0
B         4       01/10/2017      1

i'm looking for this output

ID    Operation    date         value   cumsum
A         1       01/01/2017      0        0
A         2       01/02/2017      1        1
A         3       01/06/2017      1        1
A         4       01/09/2017      0        1
B         1       01/03/2017      0        0
B         2       01/05/2017      1        1
B         3       01/09/2017      1        1
B         4       01/10/2017      1        2

now i'm using this script

DB<-DB[with(DB,order(ID,date)),]
DB<-DB %>% group_by(ID) %>% mutate(cumsum = cumsum(value)) 

but it sum value for all past operation. how can i introduce the 3 month rolling sum?

Frank
  • 66,179
  • 8
  • 96
  • 180
Tyu1990
  • 147
  • 1
  • 1
  • 6

1 Answers1

1

It's not possible to flag in advance your 3 month windows, because you want to go back 3 months from every date in your dataset and that means that your reference point (date) changes every time. Therefore you need a function that takes that into account and apply it on every row.

library(lubridate)
library(dplyr)

# sample dataset
dt = read.table(text="ID    Operation    date         value
                A         1       01/01/2017      0
                A         2       01/02/2017      1
                A         3       01/06/2017      1
                A         4       01/09/2017      0
                B         1       01/03/2017      0
                B         2       01/05/2017      1
                B         3       01/09/2017      1
                B         4       01/10/2017      1", header=T, stringsAsFactors=F)

# function that goes 3 months back from a given date and a given ID
f = function(ID_input, date_input) { 
  enddate = date_input
  startdate = date_input - months(3)
  sum((dt %>% filter(ID == ID_input & date >= startdate & date <= enddate))$value) }

f = Vectorize(f)

# update date column
dt$date = dmy(dt$date)

# run function for every row
dt %>% mutate(sumvalue = f(ID, date))


#   ID Operation       date value sumvalue
# 1  A         1 2017-01-01     0        0
# 2  A         2 2017-02-01     1        1
# 3  A         3 2017-06-01     1        1
# 4  A         4 2017-09-01     0        1
# 5  B         1 2017-03-01     0        0
# 6  B         2 2017-05-01     1        1
# 7  B         3 2017-09-01     1        1
# 8  B         4 2017-10-01     1        2
AntoniosK
  • 15,991
  • 2
  • 19
  • 32
  • You got me confused a bit, because the `value` column you provide as data is not the same as in your desired output. – AntoniosK Jul 28 '17 at 12:30
  • thanks @antoniosK this code do what i'm looking for.. sorry for the type. i've just an error Error in mutate_impl(.data, dots) : cannot coerce type 'closure' to vector of type 'character' – Tyu1990 Jul 28 '17 at 12:37
  • Can you check if you can replicate my exact code? (Including my dataset using `read.table`). If you can do that and the error is based on your actual dataset then we have to make sure that the column types are similar. – AntoniosK Jul 28 '17 at 12:40
  • thanks for your time it works, but in your code the cumulate sum stops after the window (3m) and restarts and this is not my goal. the aim is to obtain a rolling window that goes back of 3m for each operation (per ID) – Tyu1990 Jul 28 '17 at 13:23
  • 1
    Oh, I see. I'll be able to fix that. – AntoniosK Jul 28 '17 at 13:28