1

This is the data frame

Date        ID cost Value
15/12/2016   1  yes    200
15/10/2016   1  yes    100
15/9/2016    1  yes    55
15/04/2016   1  yes    1000
15/12/2016   2  yes    300
15/10/2016   2  yes    200
15/9/2016    2  yes    100
15/04/2016   2  yes    1000
15/12/2016   3  no     300
15/10/2016   3  no     200
15/9/2016    3  no     100
15/04/2016   3  no     1000

I'd like to repeat the rolling sum on 3 month for each ID that have a cost = "yes". Note that in the example the IDs are just 3 but in my DB are n.

The output should be

Date        ID  Value  Rolling_Sum
15/12/2016   1   200   355
15/10/2016   1   100   155
15/9/2016    1   55    55
15/04/2016   1   1000  1000
15/12/2016   2   300   600
15/10/2016   2   200   300
15/9/2016    2   100   100
15/04/2016   2   1000  1000

I've seen a lot of example in other questions. one of my biggest problem is that the date is not continue.. so i can have different lag between different data.

Thanks

Tyu1990
  • 147
  • 1
  • 1
  • 6

1 Answers1

2

You could use the foverlaps function from the data.table-package for this:

library(data.table)
library(lubridate)

# convert the data to a 'data.table'
setDT(dt)
# convert the Date column to date-class
dt[, Date := as.Date(Date, '%d/%m/%Y')]
# create an exact same column to be used by the 'foverlaps' function
dt[, bdate := Date]
# create a reference 'data.table' with the 3 month intervals
dtc <- copy(dt)[, bdate := Date %m-% months(3)]
# set the keys for the reference data.table (needed for the 'foverlaps' function) 
setkey(dtc, ID, bdate, Date)
# create the overlaps and summarise
foverlaps(dt[cost=='yes'], dtc, type = 'within')[, .(val = sum(i.Value)), by = .(ID, Date)]

which gives:

   ID       Date  val
1:  1 2016-12-15  355
2:  1 2016-10-15  155
3:  1 2016-09-15   55
4:  1 2016-04-15 1000
5:  2 2016-12-15  600
6:  2 2016-10-15  300
7:  2 2016-09-15  100
8:  2 2016-04-15 1000
Jaap
  • 81,064
  • 34
  • 182
  • 193
  • COOL it works, thanks.. but I don't understand bdate := Date %m-% months(3). How it works? – Tyu1990 Oct 31 '16 at 12:55
  • @Tyu1990 `bdate := Date %m-% months(3)` substracts 3 months from each date, see `?'%m+%'` (when the lubridate-package is loaded). For an explnation of `:=` see the [page on reference sematics](https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-reference-semantics.html) of the `data.table` [Getting started](https://github.com/Rdatatable/data.table/wiki/Getting-started) wiki. – Jaap Nov 01 '16 at 19:50
  • if I add a condition in column criteria, it gives me this Warning: Warning messages: 1: In difftime(e1, e2, units = "days") : Reached total allocation of 7856Mb: see help(memory.size) 2: In difftime(e1, e2, units = "days") : Reached total allocation of 7856Mb: see help(memory.size) – Tyu1990 Nov 02 '16 at 09:58
  • @Tyu1990 Can't reproduce that. Can you either post a new question or ask in teh [R-Public chat-room](http://chat.stackoverflow.com/rooms/25312/r-public)? – Jaap Nov 15 '16 at 19:51