Avoiding NA in rolling sums of last n observations within by groups using data.table

Question

According to this threat I learned, rolling sums for variable b in the following data.table can be achieved as follows:

data creation + computing rolling sums:

    x <- data.table(a = sample(letters[1:3], 100, replace = TRUE), b = runif(100))
    setorder(x, a)

    # alternative 1
    x[, .(b, Reduce(`+`, shift(b, 0:2))), by = a]
    # alternative 2
    x[, .(b, stats::filter(b, rep(1, 3), sides = 1)), by = a]

Current + desired output:

     a           b        V2   V2_desired
  1: a 0.457665568        NA   0.457665568
  2: a 0.752555834        NA   1.210221
  3: a 0.864672124 2.0748935   2.0748935
  4: a 0.542168656 2.1593966   2.1593966
  5: a 0.197962875 1.6048037   1.6048037

Now there are NAs generated for the first two obs. in each by group. I need to adjust one of the alternatives to sum only the current obs. (last two obs.) in cases where the group index starts (is at position 2). This should be generalizable such that I could consider windows of last n values and the exceptions are handled.

Any idea?

What is the desired output? What do you mean by "*I need to adjust one of the alternatives to sum only the current obs. (last two obs.)*"? Do you mean ```x[, sum(Reduce(`+`, shift(b, 0:2)), na.rm = TRUE), by = a]```? Also please use `set.seed` in order to create your data sets and only 10 obs will be enough too probably (instead of 100) — David Arenburg, Nov 03 '16 at 13:27
I'll second the request for `set.seed` and a smaller example -- you're making it unnecessarily hard to verify that the answer below works. Also, besides reading Q&A, you should read the docs behind the functions you come across, like typing `?shift` into R. — Frank, Nov 03 '16 at 13:36

score 1 · Accepted Answer · answered Nov 03 '16 at 13:32

I'm not 100% sure I'm getting what you need, but the shift function leaves behind NA values by default. You can change that behaviour by passing a fill argument. In your case, since you're summing the data, you might want to try it with fill=0:

set.seed( 123 )
x[, .(b, Reduce(`+`, shift(b, 0:2, fill=0))), by = a]

head returns:

   a         b       V2
1: a 0.5999890 0.599989
2: a 0.8903502 1.490339
3: a 0.7205963 2.210935
4: a 0.5492847 2.160231
5: a 0.9540912 2.223972
6: a 0.5854834 2.088859

Thank you, this is exactly what I needed! Upvote will become active after I reach 15 rep. — PeterPancake, Nov 03 '16 at 13:39

Avoiding NA in rolling sums of last n observations within by groups using data.table

1 Answers1