Magic in the way R evaluates function arguments

Question

Consider the following R code:

y1 <- dataset %>% dplyr::filter(W == 1)

This works, but there seems to be some magic here. Usually, when we have an expression like foo(bar), we should be able to do this:

baz <- bar
foo(baz)

However, in the presented code snippet, we cannot evaluate W == 1 outside of dplyr::filter()! W is not a defined variable.

What's going on?

`W` only exists in the scope of `dataset` - so you can evaluate `dataset$W == 1` in the same way. — thelatemail, May 08 '18 at 00:54
If I'm understanding the question correctly, this is also related to non-standard evaluation. There's a good chapter on the subject: http://adv-r.had.co.nz/Computing-on-the-language.html — Adam Bethke, May 08 '18 at 01:05
@AdamBethke Thank you! If you paste the relevant bits from that link into an answer, I'm happy to accept it. — Yatharth Agarwal, May 08 '18 at 03:56

score 2 · Accepted Answer · answered May 08 '18 at 11:37

dplyr uses a concept called Non-standard Evaluation (NSE) to make columns from the data frame argument accessible to its functions without quoting or using dataframe$column syntax. Basically:

[Non-standard evaluation] is a catch-all term that means they don’t follow the usual R rules of evaluation. Instead, they capture the expression that you typed and evaluate it in a custom way.¹

In this case, dplyr::filter captures the expression W == 1 without evaluating it, then evaluates it with the columns of dataset in scope, so that W refers to dataset$W. The reason you can't evaluate W == 1 elsewhere is that this data-aware scope exists only inside the call to the function.
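The capture-then-evaluate idea can be sketched in base R. This is a minimal illustration, not dplyr's actual implementation: `my_filter` is a hypothetical helper, and the small `dataset` is made up for the example.

```r
# Hypothetical helper for illustration only; dplyr's internals differ.
my_filter <- function(df, condition) {
  # Capture the expression the caller typed, without evaluating it.
  expr <- substitute(condition)
  # Evaluate it with the columns of df in scope; names not found in df
  # are looked up in the caller's environment.
  keep <- eval(expr, df, parent.frame())
  # Keep rows where the condition is TRUE (treat NA as FALSE).
  df[!is.na(keep) & keep, , drop = FALSE]
}

# Made-up data for the example.
dataset <- data.frame(W = c(1, 0, 1), X = c("a", "b", "c"))

# W is not a variable in the calling environment, yet this works,
# because W == 1 is only evaluated inside my_filter().
y1 <- my_filter(dataset, W == 1)
```

Here `substitute()` returns the unevaluated call `W == 1`, and `eval()` resolves `W` against the data frame first. Typing `W == 1` on its own at the console still fails, since no such lookup takes place there.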

NSE makes a trade-off: functions that capture their arguments and evaluate them in a modified scope are harder to use safely in programming, that is, when you call them from inside other functions:

This is an example of the general tension between functions that are designed for interactive use and functions that are safe to program with. A function that uses substitute() might reduce typing, but it can be difficult to call from another function.²

For example, if you wanted to write a function which would use the same code, but swap out W == 1 for W == 0 (or some completely different filter), NSE would make that more difficult to accomplish.
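To make the difficulty concrete, here is a base R sketch under the same assumptions as before: `my_filter`, `my_filter_q`, `wrapper`, and `filter_w` are hypothetical names, and `dataset` is made-up data. A naive wrapper fails because the inner function captures the wrapper's argument name rather than the expression the user typed. One common workaround is a variant that accepts an already-quoted expression.

```r
# Hypothetical NSE helper, for illustration only.
my_filter <- function(df, condition) {
  expr <- substitute(condition)  # captures whatever the *direct* caller typed
  keep <- eval(expr, df, parent.frame())
  df[!is.na(keep) & keep, , drop = FALSE]
}

dataset <- data.frame(W = c(1, 0, 1, 0), X = c("a", "b", "c", "d"))

# Naive wrapper: my_filter() captures the symbol `cond`, not `W == 0`.
# Forcing `cond` then evaluates W == 0 outside the data frame and fails.
wrapper <- function(df, cond) my_filter(df, cond)
result <- tryCatch(
  wrapper(dataset, W == 0),
  error = function(e) conditionMessage(e)
)

# Workaround: a variant that takes an expression that is already quoted.
my_filter_q <- function(df, expr) {
  keep <- eval(expr, df, parent.frame())
  df[!is.na(keep) & keep, , drop = FALSE]
}

# bquote() builds the call W == <value>, substituting the value in.
filter_w <- function(df, value) my_filter_q(df, bquote(W == .(value)))

y0 <- filter_w(dataset, 0)
y1 <- filter_w(dataset, 1)
```

In this sketch `result` holds an error message rather than a data frame, while `filter_w()` works because the expression is built explicitly and passed along as data.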

In 2017, the tidyverse started building a solution to this problem: tidy evaluation.
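As a rough sketch of what this looks like in practice, assuming a reasonably recent dplyr (1.0 or later) where the embrace operator `{{ }}` and the `.data` pronoun are available: `filter_by` and `filter_col_equals` are hypothetical wrappers, and `dataset` is made-up data.

```r
library(dplyr)

# Made-up data for the example.
dataset <- data.frame(W = c(1, 0, 1, 0), X = c("a", "b", "c", "d"))

# Hypothetical wrapper: {{ cond }} forwards the expression the caller
# typed to filter(), so it is still evaluated against the data frame.
filter_by <- function(df, cond) {
  df %>% dplyr::filter({{ cond }})
}

y1 <- filter_by(dataset, W == 1)
y0 <- filter_by(dataset, W == 0)

# Hypothetical wrapper: the .data pronoun lets the column name arrive
# as an ordinary string instead of an unquoted expression.
filter_col_equals <- function(df, col, value) {
  df %>% dplyr::filter(.data[[col]] == value)
}

y0_again <- filter_col_equals(dataset, "W", 0)
```

The first wrapper keeps the interactive feel of `filter()`, while the second trades that for plain string and value arguments, which are easier to generate programmatically.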
