
Something really odd is going on here. In the code below, I create a variable called temp, and I have to call it twice before I can see what it is. The first time I call it, the console shows nothing. The second time, it prints the data.table/data.frame as expected. Can anyone help me understand what's going on here?

library(magrittr)
library(data.table)

myDT <- as.data.table(mtcars)


temp <- 
    myDT %>%
    melt(id.vars = c('cyl', 'mpg', 'hp'), 
         measure.vars = c('vs','am','gear','carb'),
         variable.name = 'Data') %>%
    extract( value > 0) %>%
    extract( , value := NULL)

What my console is doing (the first call doesn't do anything):

> temp
> temp
    cyl  mpg  hp Data
 1:   4 22.8  93   vs
 2:   6 21.4 110   vs
 3:   6 18.1 105   vs
 4:   4 24.4  62   vs
 5:   4 22.8  95   vs
 ...
 ...
jks612
  • 1
    I see this too, in RStudio and Rterm (R version 3.2.2 (2015-08-14); Platform: x86_64-w64-mingw32/x64 (64-bit); data.table_1.9.6). `print.data.frame(temp)` works first go. – jbaums Jan 08 '16 at 00:28
  • I've always noticed this after a `:=` or `set()` call in `data.table` – tospig Jan 08 '16 at 00:29
  • 4
    I am certain I've seen this come up before as a known behavior with data.table. There is a duplicate on SO somewhere, probably one of the data.table gurus will know where it is. – joran Jan 08 '16 at 01:07
  • 8
    search? http://stackoverflow.com/questions/32988099/data-table-objects-not-printed-after-returned-from-function or http://stackoverflow.com/questions/34270165/when-and-why-does-print-need-two-attempts-to-print-a-data-table or http://stackoverflow.com/questions/34278964/why-does-a-data-table-from-fread-return-nothing-on-first-print-only – rawr Jan 08 '16 at 01:34

1 Answer


This is a known side effect of a fix implemented to squash an even bigger bug. It's documented here, as the first item under the "BUG FIXES" section of the v1.9.6 release. Quoting from that link:

if (TRUE) DT[,LHS:=RHS] no longer prints, #869 and #1122. Tests added. To get this to work we've had to live with one downside: if a := is used inside a function with no DT[] before the end of the function, then the next time DT or print(DT) is typed at the prompt, nothing will be printed. A repeated DT or print(DT) will print. To avoid this: include a DT[] after the last := in your function. If that is not possible (e.g., it's not a function you can change) then DT[] at the prompt is guaranteed to print. As before, adding an extra [] on the end of a := query is a recommended idiom to update and then print; e.g. > DT[,foo:=3L][]. Thanks to Jureiss and Jan Gorecki for reporting.

As explained there, the solution is to append a trailing [] to the final :=-containing operation in your function. Here, that would mean doing the following:

library(magrittr)
library(data.table)    
myDT <- as.data.table(mtcars)
temp <- 
    myDT %>%
    melt(id.vars = c('cyl', 'mpg', 'hp'), 
         measure.vars = c('vs','am','gear','carb'),
         variable.name = 'Data') %>%
    extract( value > 0) %>%
    extract( , value := NULL) %>% `[`

## Following which, this will print the first time
temp
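
The release note's advice for functions can be sketched the same way. This is a minimal illustration, not code from the data.table documentation: `add_ratio` and the `ratio` column are hypothetical names made up for this example, and the printing behaviour may differ between data.table versions.

```r
library(data.table)

# Hypothetical helper: adds a column by reference with `:=`, then ends with
# DT[] so that the next print at the prompt is not suppressed.
add_ratio <- function(DT) {
  DT[, ratio := mpg / hp]  # `:=` on its own returns the table invisibly
  DT[]                     # trailing [] as recommended in the release note
}

myDT <- as.data.table(mtcars)
res <- add_ratio(myDT)

# Should print on the first attempt
res
```

If the function cannot be changed, the release note says that typing `res[]` at the prompt is guaranteed to print.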
jangorecki
Josh O'Brien