I am converting back and forth between data.frame and data.table using a workflow like: setDT() + some data.table operation + setDF() (see this post). At the top level, setDF() correctly transforms my input back into a data.frame.

However, when I run exactly the same code inside a function, the input is no longer a data.frame after the setDF() call! I suspect this comes from a difference between modifying by reference locally and globally, but I don't quite understand it, and in particular I don't see how to get my input back to a data.frame (calling setDF() outside the function would obviously work, but that's not the idea).

library(data.table)
data(iris)

## using setDF outside of function
setDT(iris)
iris[, Sepal.Length.Mean := mean(Sepal.Length), by = Species]
setDF(iris)

class(iris)
#> [1] "data.frame"

## using setDF within a function
fo <- function(df){
  setDT(df)
  df[, Sepal.Length.Mean := mean(Sepal.Length), by = Species]
  setDF(df)
  df
}

res <- fo(iris)
class(iris)
#> [1] "data.table" "data.frame"

Created on 2020-11-17 by the reprex package (v0.3.0)

Matifou
  • Shouldn't you check `class(res)`, since that is where you are storing the returned object? – Ronak Shah Nov 18 '20 at 03:41
  • You are correct, `res` is of the right class, but my goal is to leave the class of the input unchanged, and since `setDF()` changes by reference, I hoped that would be the case!? – Matifou Nov 18 '20 at 03:52
  • I think the function creates its own copy of the object and doesn't update it by reference. – Ronak Shah Nov 18 '20 at 03:53
  • I would be surprised by that! setDT() + [] + setDF() update by reference, and I don't think that behavior would depend on whether they are called outside or within a function!? At least that's what print(lobstr::ref(df)) seems to indicate? – Matifou Nov 18 '20 at 04:00
  • I distinctly remember that this used to be different, but I can reproduce this as far back as data.table 1.9.6 (on R 3.3.0), which is the oldest version I still have installed. You should take this to the data.table issue tracker because AFAICS nothing should trigger a copy here. – Roland Nov 18 '20 at 07:50
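
Until the by-reference behavior discussed in the comments is clarified, one possible workaround is to avoid touching the caller's object at all. This is only a sketch, not a fix for the underlying issue: it assumes an extra copy of the data is acceptable, and the name `fo_copy` is hypothetical. It relies on `as.data.table()` returning a new object for a data.frame input, so neither `:=` nor `setDF()` ever operates on the original `iris`.

```r
library(data.table)
data(iris)  # start again from the plain data.frame

## hypothetical variant of fo(): work on a separate data.table
fo_copy <- function(df) {
  dt <- as.data.table(df)  # new object; the caller's df is not modified
  dt[, Sepal.Length.Mean := mean(Sepal.Length), by = Species]
  setDF(dt)                # convert the local result back, by reference
  dt
}

res <- fo_copy(iris)
class(iris)
#> [1] "data.frame"
class(res)
#> [1] "data.frame"
```

The trade-off is memory: the copy defeats the purpose of setDT() for large inputs, so this only helps when leaving the input's class unchanged matters more than avoiding the copy.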
