Originally, I thought it would be necessary to explicitly convert the data.table to a "data table tbl", using tbl_dt, to retain class data.table:
library(data.table)
library(dtplyr)
library(magrittr)
mtcars_dt %>% tbl_dt() %>% dplyr::select(hp, mpg) %>% class
# [1] "tbl_dt" "tbl" "data.table" "data.frame"
mtcars_dt %>% tbl_dt() %>% dplyr::filter(hp > 100) %>% class
# [1] "tbl_dt" "tbl" "data.table" "data.frame"
However, as pointed out by Frank in the comments, merely loading dtplyr is enough:
mtcars_dt %>% dplyr::select(hp, mpg) %>% class
# [1] "data.table" "data.frame"
mtcars_dt %>% dplyr::filter(hp > 100) %>% class
# [1] "data.table" "data.frame"
Odd, isn't it? I posted a dtplyr issue, so hopefully some dtplyr aficionados can shed some light on this.
The .data argument and the Value section are the same in ?filter and ?select, so from this information alone it's hard to tell why .data of class data.table is treated differently in the two functions.
After this little exercise, I would still argue that you should stick to data.table syntax. In particular, you can chain operations:
mtcars_dt[ , .(hp, mpg)][hp > 100]
# or
mtcars_dt[j = .(hp, mpg)][i = hp > 100]
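For completeness, here is a minimal self-contained sketch of the chaining above. It assumes mtcars_dt is simply the built-in mtcars data set converted with as.data.table (the definition is not shown in this answer, so that part is my assumption), and the names chained and single are only illustrative. It also shows the equivalent single-call form, which filters in i and selects in j in one step.

```r
library(data.table)

# Assumption: mtcars_dt is the built-in mtcars data set as a data.table.
mtcars_dt <- as.data.table(mtcars)

# Chained form: select the columns first, then filter the rows.
chained <- mtcars_dt[, .(hp, mpg)][hp > 100]

# Single-call form: filter in i and select in j at once.
single <- mtcars_dt[hp > 100, .(hp, mpg)]

# Both results are still data.tables and contain the same rows.
class(chained)
class(single)
isTRUE(all.equal(chained, single))
```

The single-call form avoids building the intermediate two-column table, so it is probably the one to prefer when the filter and the selection are both known up front. Chaining is handy when a later step depends on a column created in an earlier one.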