Consider the dummy example below: I want to run a model on a range of subsets of a data.table in a loop, and I want to specify the exact line to evaluate on each iteration as a string (using an iterator i):
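library(data.table)

DT <- data.table(X = runif(100), Y = runif(100))

# Evaluate the code string once per subset size; `i` is the iterator
# that the string refers to.
f1 <- function(code) {
  for (i in c(20, 30, 50)) {
    eval(parse(text = code))
  }
}

f1("lm(X ~ Y, data = DT[sample(.N, i)])")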
Obviously this doesn't return any output, as lm() is merely evaluated 3 times and its results are discarded. The actual use case is more convoluted; this is meant to be a simplified version of it.
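To make the effect visible, a variant that collects the fitted models in a list could look like the sketch below. The names f1_collect, sizes, and fits are made up for this illustration and are not part of the real code; it assumes DT lives in the global environment, as above.

```r
library(data.table)

DT <- data.table(X = runif(100), Y = runif(100))

# Hypothetical variant of f1 that keeps each evaluated result
f1_collect <- function(code, sizes = c(20, 30, 50)) {
  expr <- parse(text = code)             # parse once, evaluate many times
  fits <- vector("list", length(sizes))
  for (k in seq_along(sizes)) {
    i <- sizes[k]                        # `i` is the iterator the string refers to
    fits[[k]] <- eval(expr)              # evaluated in this function's frame
  }
  fits
}

fits <- f1_collect("lm(X ~ Y, data = DT[sample(.N, i)])")
sapply(fits, nobs)  # number of rows used by each fit
```

Run from the global environment, this behaves like the original f1, only with the three lm objects returned instead of dropped.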
The example above, nonetheless, works fine. The problems begin when the function f1 is included in a package instead of being defined in the global environment. If I'm not mistaken, in this case f1 is defined in the package's own env (its namespace). Then, calling f1 from the global env gives the error: "Error in [.data.frame(x, i) : undefined columns selected". R seems to correctly access the iterator i in the package's env and DT in the global env, but cannot access the column by name inside data.table's square brackets.
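The [.data.frame in the message suggests that DT is being subset with data.frame semantics, where a single index selects columns rather than rows. The base R sketch below reproduces the same message on a plain data.frame. The names df and msg are made up for this illustration, and whether this is really what happens inside the package is an assumption on my part.

```r
# A plain data.frame with the same two columns as DT
df <- data.frame(X = runif(100), Y = runif(100))

# With a single index, `[.data.frame` selects COLUMNS, so indices
# beyond the two existing columns trigger the error.
msg <- tryCatch(
  {
    df[c(5, 17, 42)]
    "no error"
  },
  error = function(e) conditionMessage(e)
)

msg
```

With a data.table, the same single-index call would select rows 5, 17, and 42 instead.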
I tried experimenting by setting the envir and enclos arguments of eval() to baseenv(), globalenv(), and parent.frame(), but haven't managed to find a combination that works.
For example, setting envir = globalenv() seems to give access to DT and i, but not to X and Y from DT inside lm(). With envir = baseenv(), we lose the global env and cannot access DT (envir = baseenv(), enclos = globalenv() doesn't change that). Using envir = list(baseenv(), globalenv()) results, I think, in not being able to access anything inside data.table's square brackets; the error message is again "Error in [.data.frame(x, i) : undefined columns selected".
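For reference, here is my understanding of how eval() treats these arguments, checked on plain base R objects: enclos is only consulted when envir is a list, data frame, or pairlist, and only the named elements of a list become variables. That would explain why adding enclos = globalenv() to envir = baseenv() changes nothing, and why an unnamed list of environments behaves like the default call. The names probe_env, here, x, and y are made up for this sketch.

```r
here <- environment()   # the environment this script runs in
y <- 2                  # defined in `here`

# An isolated environment that only knows `x`
probe_env <- new.env(parent = emptyenv())
assign("x", 1, envir = probe_env)

# envir is an environment: enclos is ignored, so `y` cannot be found
res_env <- tryCatch(
  eval(quote(y), envir = probe_env, enclos = here),
  error = function(e) "not found"
)

# envir is a named list: its names are tried first, then enclos
res_list <- eval(quote(x + y), envir = list(x = 10), enclos = here)

# envir is an unnamed list of environments: it supplies no variables,
# so every symbol is resolved through enclos
res_unnamed <- eval(quote(y), envir = list(baseenv(), here), enclos = here)

res_env      # "not found"
res_list     # 12
res_unnamed  # 2
```

So none of the combinations I tried actually changes which environment the data.table subsetting itself is evaluated from.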