
The data.table package has a shift function that helps when working with consecutive rows. For example:

> shift(1:10, 1:3, type = "lag")
[[1]]
 [1] NA  1  2  3  4  5  6  7  8  9

[[2]]
 [1] NA NA  1  2  3  4  5  6  7  8

[[3]]
 [1] NA NA NA  1  2  3  4  5  6  7

I know that the dplyr package has the functions lead and lag, which do roughly the same thing as shift in data.table. The problem is that they accept only a single offset, so you cannot request several consecutive lags in one call. For example:

> lag(1:10, 1)
 [1] NA  1  2  3  4  5  6  7  8  9
> lag(1:10, 2)
 [1] NA NA  1  2  3  4  5  6  7  8
> lag(1:10, 3)
 [1] NA NA NA  1  2  3  4  5  6  7

But you cannot pass a vector of offsets, as in lag(1:10, 1:3), which fails with an error:

> lag(1:10, 1:3)
Error in lag(1:10, 1:3) : n must be a single positive integer
In addition: Warning message:
In if (n == 0) return(x) :
  the condition has length > 1 and only the first element will be used

So my question is: is there a function in dplyr that corresponds to shift in data.table? Any clarification would be appreciated!

Psidom
  • 1
    I think you're stuck with `lapply(1:3, function(i) dplyr::lag(1:10, i))` – jaimedash Apr 15 '16 at 20:36
  • 6
    You could follow the advice in the accepted answer here: http://stackoverflow.com/q/33507868/ – Frank Apr 15 '16 at 21:05
  • 1
    Thanks! I just figured that I can steal the function `shift` from `data.table`. – Psidom Apr 15 '16 at 21:09
  • 3
    Stealing it with `shift <- data.table::shift` shouldn't be an issue, as data.table doesn't have any dependencies since 1.9.7 and it builds very fast. So it won't slow or bloat environment setup. BTW. feel free to self answer your question. – jangorecki Apr 15 '16 at 22:14
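
The `lapply` suggestion from the comments can be wrapped in a small helper so that it accepts a vector of offsets, much like `shift` does. This is only a minimal sketch, assuming dplyr is installed; `lag_multi` is a hypothetical name chosen for illustration and is not a dplyr function.

```r
# Hypothetical helper (not part of dplyr): call dplyr::lag once per offset
# and collect the results in a list, mirroring the list that
# data.table::shift returns when given several offsets.
lag_multi <- function(x, n) {
  lapply(n, function(i) dplyr::lag(x, i))
}

# One list element per requested offset.
res <- lag_multi(1:10, 1:3)
print(res)
```

Returning a list keeps the shape close to what `shift(1:10, 1:3, type = "lag")` produces. If data.table is available anyway, borrowing `data.table::shift` directly, as the last comment suggests, avoids the wrapper altogether.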

0 Answers