9

I would like to mix data.table pipelining with magrittr pipelining. I can go from data.table to %>% but I can't figure out how to get back to [][] data.table style pipelining.

Here's an example:

> tbl = data.table(grp=c(1,1,1,2,2,2,3,3,3,4,4), y=rnorm(11))
> tbl
    grp        y
 1:   1  0.08150
 2:   1  1.51330
 3:   1 -0.26154
 4:   2 -0.12746
 5:   2  0.10747
 6:   2  0.16502
 7:   3  0.54139
 8:   3 -0.04194
 9:   3  0.02373
10:   4  2.00756
11:   4  1.05523
> tbl[, .(.N, mean(y)), by=grp][order(-N)] %>% head(n=3) %>% .[, N := NULL]
   grp      V2
1:   1 0.44442
2:   2 0.04834
3:   3 0.17439
> tbl[, .(.N, mean(y)), by=grp][order(-N)] %>% head(n=3) %>% .[, N := NULL][, plot(grp, V2)]
Error in `[.data.table`(., .[, `:=`(N, NULL)], , plot(grp, V2)) : 
  'by' or 'keyby' is supplied but not j
Calls: %>% ... freduce -> withVisible -> <Anonymous> -> [ -> [.data.table
> 

How can I go back to [][] after %>% ?

I know that this particular example could be rewritten entirely with [] and no %>%, but I'm not interested in doing that every time. I'd like a way to be able to write [][] %>% [][] patterns.

Frank
Clayton Stanley

3 Answers

3

Both previous answers overlook that you can, to some extent, control the order of evaluation. Enclose the %>% part of the code in {} so that it is evaluated before the [ call:

x <- data.frame(a=1:5, b=6:10)
{x %>% subset(a<4) %>% data.table()}[, mean(b)]

Not pretty, but it works:

> {x %>% subset(a<4) %>% data.table()} [, mean(b)]
[1] 7
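
If moving the cursor backwards is the concern, the same result can probably be reached without leaving the pipe, by giving the bracket step its own %>% and using magrittr's `.` placeholder. A minimal sketch that reuses the x from above (the name res is only for illustration):

```r
library(data.table)
library(magrittr)

x <- data.frame(a = 1:5, b = 6:10)

# Pipe into the bracket call instead of wrapping the pipeline in {}.
# `.` stands for the value arriving from the left-hand side of %>%.
res <- x %>%
  subset(a < 4) %>%
  data.table() %>%
  .[, mean(b)]

print(res)
```

Each further [ step needs its own `%>% .` in front of it, so this is not quite the [][] style, but it keeps the code reading left to right.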
jbowman
  • Appreciate it, but I'm looking for pretty. The main problem here is how much cursor movement backwards you'd have to do, when your mind wants to move forward to the next step in the pipeline. – Clayton Stanley May 02 '17 at 22:56
  • I admit I came to this question while looking for a pretty way of doing this myself! – jbowman May 10 '17 at 19:26
2

You can do

 `tbl %>% filter(y>0) %>% data.table()` 

to convert the pipeline result to a data.table, for example to print the results the data.table way. Unfortunately, you cannot do something like

 `tbl %>% filter(y>0) %>% data.table() [, mean(y), by=grp]`

I wonder whether this functionality could be added to a future data.table version, maybe through new syntax (to overcome the precedence limitation, as "[" is evaluated before "%>%").

IVIM
1

You can't. [ has higher precedence than %any%, so it will always be evaluated first.
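
The grouping can be inspected with quote(), which parses an expression without evaluating it. The sketch below uses a shortened form of the failing call from the question (lhs is a placeholder name, and V2 stands in for the plot() call). As far as I can tell, the error arises because the right-hand side of %>% becomes one outer [ call whose first argument is the inner `.[, N := NULL]`; since `.` is then no longer a direct argument of that outer call, magrittr also inserts the piped value as its first argument, which matches the call shown in the error message. Giving each bracket step its own %>% sidesteps this:

```r
library(data.table)
library(magrittr)

# quote() only parses; element [[3]] is the right-hand side of %>%.
rhs <- quote(lhs %>% .[, N := NULL][, V2])[[3]]
print(rhs[[1]])  # the outer function of the right-hand side
print(rhs[[2]])  # its first argument: the inner bracket call, not a bare `.`

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
tbl <- data.table(grp = c(1,1,1,2,2,2,3,3,3,4,4), y = rnorm(11))

# One bracket call per pipe step, each starting from the `.` placeholder.
res <- tbl[, .(.N, mean(y)), by = grp][order(-N)] %>%
  head(n = 3) %>%
  .[, N := NULL] %>%
  .[, .(grp, V2)]
print(res)
```

The last step selects columns instead of plotting so the sketch runs without a graphics device; `.[, plot(grp, V2)]` should slot into the same position.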

eddi