
I've noticed that dtplyr (version 1.0.1, released this January) uses as.data.table to convert the variable back to a data.table: https://dtplyr.tidyverse.org/articles/translation.html

I'm a big fan and user of data.table and have used it in pipelines with dplyr for many years. For that purpose I wrote many of the same wrapper functions that are now part of dtplyr.

However, I use setDT, since I thought it is more efficient and in keeping with the data.table philosophy of modifying by reference.
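To make the distinction concrete, here is a minimal sketch of the two conversions on a toy data.frame (the names `df`, `dt_copy`, `unchanged` and `converted` are made up for illustration and do not come from dtplyr):

```r
library(data.table)

df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# as.data.table() returns a new data.table; df itself is left untouched
dt_copy <- as.data.table(df)
unchanged <- !is.data.table(df)   # df is still a plain data.frame

# setDT() converts df in place (by reference), without copying the columns
setDT(df)
converted <- is.data.table(df)    # df itself is now a data.table
```

After `setDT(df)`, any other code that still expects `df` to be a plain data.frame sees a data.table instead, which is the side effect the copying approach avoids.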

So I wonder why Hadley isn't using it.
And in general, which of the two is more efficient when one needs to convert a data.frame (or tibble) to a data.table?

IVIM
    as.data.table is used because it guarantees that a copy of the input object is made. In general this is slower than setDT, but tidyverse principles require immutability and so avoid changing the input object. IIRC there is an option to disable this; I forget the value off the top of my head, but you can check the vignette. – MichaelChirico Apr 04 '20 at 13:01
