
In this question, the following throws an error:

subset2 = function(df, condition) {
  condition_call = eval(substitute(condition), df)
  df[condition_call,]
}

df = data.frame(a = 1:10, b = 2:11)
condition = 3

subset2(df, a < condition)
## Error in eval(substitute(condition), df) : object 'a' not found

Josh and Jason from the original question did a great job explaining why this is. What I don't get is why supplying the enclos argument to eval apparently fixes it.

subset3 = function(df, condition) {
  condition_call = eval(substitute(condition), envir = df, enclos = parent.frame())
  df[condition_call, ]
}

subset3(df, a < condition)
##   a b
## 1 1 2
## 2 2 3

I understand that skipping the function environment means R is no longer trying to evaluate the promise, and instead grabs the condition object from the global environment.

But I think supplying enclos = parent.frame() should not make a difference. From ?eval on the enclos argument:

Specifies the enclosure, i.e., where R looks for objects not found in envir.

But if not provided, it defaults to:

enclos = if(is.list(envir) || is.pairlist(envir)) parent.frame() else baseenv()

which, in my mind, should resolve to parent.frame() anyway, because surely, df satisfies the is.list() check.

This means that as long as some object data returns TRUE on is.list(), the behavior of eval(expr, envir = data) and eval(expr, envir = data, enclos = parent.frame()) should be identical. But as evidenced by the above, it isn't.

What am I missing?

EDIT: Thanks to SmokeyShakers, who pointed out that default and user-supplied arguments differ in where they are evaluated: a default is evaluated inside the function's own environment, while a supplied argument is evaluated in the caller's environment. I think this is actually already expressed here: https://stackoverflow.com/a/15505111/2416535
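
The difference can be reproduced without eval(). The following is a minimal sketch, not taken from the linked answer; which_env and caller are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: returns whatever environment its argument evaluates to.
which_env <- function(env = parent.frame()) env

# Hypothetical caller: compares both ways of obtaining parent.frame().
caller <- function() {
  here <- environment()
  default_env  <- which_env()                # default: evaluated inside which_env
  supplied_env <- which_env(parent.frame())  # supplied: evaluated inside caller
  list(
    default_is_caller  = identical(default_env, here),
    supplied_is_caller = identical(supplied_env, here)
  )
}

res <- caller()
print(res)
```

With the default, parent.frame() is evaluated inside which_env and so yields the frame of caller. When supplied, the same expression is evaluated inside caller and yields whatever called caller instead.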

It might make sense to keep this one alive though, as it touches eval() specifically (the other does not), and it is not trivial to realize what the generalized question should be until one has the answer.

jakub
  • Is it because of where `parent.frame()` is evaluated? When supplied, it's evaluated in the environment you call from. When not supplied, it's evaluated in the function's internal environment. – SmokeyShakers Jan 03 '20 at 20:16
  • @SmokeyShakers that makes perfect sense. If you happen to have a link to a place where this is described (more specific than "Hadley's book" or "R manuals"), I'd be happy to accept that as an answer. Alternatively, if this is a duplicate, I'm happy to close. – jakub Jan 03 '20 at 20:21
  • I 100% got it from Advanced R. I'll try and find the page. – SmokeyShakers Jan 03 '20 at 20:29

2 Answers

So, different parents. In the example that doesn't work, the default parent.frame() is evaluated inside eval, so it resolves to the internal environment of subset2. In the working example, the supplied parent.frame() is evaluated inside subset3, so it resolves to the caller of subset3, likely your global environment where your df sits.

Example:

tester <- function() {
  print(parent.frame())
} 

tester() # Global

(function() {
  tester()
})() # Anonymous function's internal env
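
The same effect can be seen with eval() itself. This is a small sketch under the assumption that it is run at the top level; g and x are hypothetical names used only for illustration.

```r
x <- "global"

# Hypothetical function with a local x that shadows the outer one.
g <- function() {
  x <- "local to g"
  c(
    # Default enclos: parent.frame() is evaluated inside eval, giving g's frame.
    default  = eval(quote(x), list()),
    # Supplied enclos: parent.frame() is evaluated inside g, giving g's caller.
    supplied = eval(quote(x), list(), enclos = parent.frame())
  )
}

res <- g()
print(res)
##      default     supplied
## "local to g"     "global"
```
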
SmokeyShakers

It is getting confused by the double use of condition, which names both the function's argument and the global variable. Change one of them to cond:

cond <- 3
subset2(df, a < cond)
##   a b
## 1 1 2
## 2 2 3

Note that even though this works at the top level, it won't work if you call subset2 from inside another function f. The lookup goes from the subset2 execution frame to the global environment where subset2 was defined and never looks inside the f execution frame, so subset3 will be needed.

if (exists("cond")) rm(cond)
f <- function() {
  cond <- 3
  subset2(df, a < cond)
}
f() # error
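
For comparison, here is a sketch of the same situation using subset3 from the question. f3 is a hypothetical name; the expectation is that cond is found because enclos points at the frame that called subset3.

```r
# subset3 as in the question: enclos is supplied by the caller.
subset3 <- function(df, condition) {
  rows <- eval(substitute(condition), envir = df, enclos = parent.frame())
  df[rows, ]
}

df <- data.frame(a = 1:10, b = 2:11)

# Hypothetical wrapper: cond exists only inside f3's execution frame.
f3 <- function() {
  cond <- 3
  subset3(df, a < cond)
}

res <- f3()
print(res)
##   a b
## 1 1 2
## 2 2 3
```
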
G. Grothendieck
  • That part I understood. When the `enclos = parent.frame()` is supplied, the `condition` inside the function env is skipped, eliminating confusion. What was confusing me was the behavior of the `enclos` argument in `eval`, specifically why one has to supply `parent.frame()` even though it is the default. But SmokeyShakers explained that. – jakub Jan 03 '20 at 20:37