Chapter 19 of Advanced R explains that expr() is not useful inside a function.

However, in the following case, I could not make a function work without expr().

Let's suppose I want to group a tibble in a function.

library(dplyr)  # provides %>% and group_by()
data(iris)
iris %>% group_by(Species)

The obvious approach is to use "curly curly".

func_a <- function(data, grouping) {
  data %>% group_by({{grouping}})
}
func_a(iris, Species)

However, "curly curly" does not work if I allow arbitrary expressions to be supplied through ....

func_b <- function(data, ...) {
  data %>% group_by({{...}})
}
func_b(iris, Species)
# Error in (function (x)  : object 'Species' not found

In the end, I found I need expr() to make it work.

func_c <- function(data, ...) {
  grouping <- expr(...)
  data %>% group_by(!!grouping)
}
func_c(iris, Species)

The example of expr() in Advanced R is:

f1 <- function(x) expr(x)
f1(a + b + c)
#> x

My main question is why func_c works. Does expr() take ... as it is, and is it then evaluated with !!? Why do we have to take a different approach for ...?

Then, I am not sure why this does not work.

func_d <- function(data, grouping) {
  grouping <- expr(grouping)
  data %>% group_by(!!grouping)
}
func_d(iris, Species)

I also checked the rlang manual, but the explanation is too brief for me.

user51966
  • I'm confused. Do you want to be able to pass only one column to your group_by or do you want to be able to pass multiple columns? Because I don't think any of your current versions allow for that at the moment. Also you should look into the difference between `enexpr()` and `expr()` when it comes to capturing symbols passed to a function. I'm struggling to answer your question because it's still not clear to me exactly what you are asking. Do you want to hear why each of those functions is wrong or do you just want a working answer? – MrFlick Jul 06 '20 at 01:58
  • I used a single column to consider a simple case, but I found `func_c` failed with multiple columns. To make it work, I had to use `rlang::exprs()` and `!!!` instead of `!!`. Yes, I am still not sure about the difference between `enexpr()` and `expr()`, as you pointed out. In the multiple-column case, both `rlang::exprs()` and `rlang::enexprs()` with `!!!` worked. The documentation says the latter is for "expressions supplied by the user of your function," but I don't know why there is no difference in this case. – user51966 Jul 06 '20 at 04:42
  • @user51966 `expr()` captures the expression provided to it. `enexpr()` captures the expression provided to the "parent" function, instead. `exprs()` and `enexprs()` are list equivalents. – Artem Sokolov Jul 06 '20 at 04:46
  • @ArtemSokolov Thanks. What do you mean by "parent"? For example, `f1` returns `Species` but `f2` raises an error: `f1 <- function(data, ...) { expr(...) } ; f1(iris, Species) ; f2 <- function(data, ...) { rlang::enexpr(...) } ;f2(iris, Species)`. However, `f3` and `f4` return the same thing. `f3 <- function(data, ...) { rlang::exprs(...) } ; f3(mtcars, gear, carb) ; f4 <- function(data, ...) { rlang::enexprs(...) } ;f4(mtcars, gear, carb)`. – user51966 Jul 06 '20 at 04:58
  • Use `enexpr` when you have a named parameter: `f2 <- function(data, g) { rlang::enexpr(g) } ;f2(iris, Species)`. `...` works very differently, so you don't have to be as careful. Just make sure to use the plurals with `...` to be safe because you never know how many things are in there. – MrFlick Jul 06 '20 at 05:22
  • @user51966 Dots work a little differently than named arguments. In general, [it's best to just pass them directly to tidyverse functions](https://tidyeval.tidyverse.org/multiple.html#simple-forwarding-of-...) unless you need access to the values inside your function. – Artem Sokolov Jul 06 '20 at 14:44
  • @user51966 The difference between `expr()` and `enexpr()` is best demonstrated without dots. Consider `f <- function(x) {expr(x)}` and `g <- function(x) {enexpr(x)}`. Calling `f(y)` will return `x` because `x` was passed to `expr()`. Calling `g(y)` will return `y` because `y` was passed to the "parent" function `g`. – Artem Sokolov Jul 06 '20 at 14:46

2 Answers

expr prevents the evaluation of code. For instance, attempting to run x by itself will fail unless x is a variable you've previously declared - R will go looking for the value in x when you evaluate it, and will issue an error if no such value is found. In contrast, expr(x) will never fail (even if x hasn't been declared yet), because the expr tells R "take this at face value and don't go looking for something else it might represent". expr(x) will return something of the type "name", which is basically just that - a name. You can think of a name as the interface between you and R - it's what you type in, and how you communicate your instructions. eval(expr(x)) is the same as just doing x.
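As a minimal sketch of the point above (assuming the rlang package is installed; `some_undefined_name` and `quoted` are illustrative names, not from the question), quoting a name that was never defined succeeds, while evaluating a quoted name behaves like typing the name itself:

```r
library(rlang)

# expr() quotes its argument, so no lookup happens here even though
# `some_undefined_name` has never been assigned.
quoted <- expr(some_undefined_name)
class(quoted)
#> [1] "name"

# Evaluating a quoted name is the same as typing the name directly.
x <- 10
eval(expr(x))
#> [1] 10
```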

Now taking your examples in order:

func_c <- function(data, ...) {
  grouping <- expr(...)
  data %>% group_by(!!grouping)
}
func_c(iris, Species)

This works because Species is passed directly to expr() through the ..., and the result is a "name", which gets stored in the variable grouping. The !! operator then unquotes grouping: R first looks up what the grouping variable holds and finds the name Species. The variable is replaced with that name, so group_by() effectively receives Species, and within the context of the group_by function R resolves it to the column called "Species".

Moving on to your next example:

f1 <- function(x) expr(x)
f1(a + b + c)

This returns x rather than a + b + c simply because expr(x) blocks any evaluation. R doesn't go looking for what's inside x, so it never finds a + b + c; it takes the x at face value, and that's what you get.
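For contrast, here is a small sketch (assuming rlang is installed) that puts f1 next to a hypothetical variant, `f2`, which uses `enexpr()` as suggested in the comments on the question. `enexpr()` captures what the caller supplied for x instead of the name x itself:

```r
library(rlang)

f1 <- function(x) expr(x)     # quotes the literal name `x`
f2 <- function(x) enexpr(x)   # hypothetical variant: captures the caller's expression

f1(a + b + c)
#> x
f2(a + b + c)
#> a + b + c
```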

Finally, we have your last example:

func_d <- function(data, grouping) {
  grouping <- expr(grouping)
  data %>% group_by(!!grouping)
}
func_d(iris, Species)

This is similar to your first example, but there's an extra variable at play here - the parameter called grouping. In your first example, Species entered the function directly (through ...), so it wasn't bound to any parameter name. In this third example, Species enters the function through a named parameter, i.e. bound to the variable grouping. However, expr(grouping) tells R "don't bother looking for what grouping represents, I have everything I need right here"... so it never finds Species at all. expr(grouping) just gives you the name grouping, regardless of whatever's in the variable itself. So then when you try to evaluate that using !!grouping within group_by, R tries to look for a column name called grouping... Needless to say, it doesn't find it, and you get a Column grouping is not found error.

Count Orlok

Think of it as expr() and !! effectively negating each other.

func_c <- function(data, ...) {
  grouping <- expr(...)
  data %>% group_by(!!grouping)
}

is equivalent to

func_c <- function(data, ...) {
  data %>% group_by(...)            # Proper way to handle dots, by the way
}

(It's not an exact equivalence, because the former implementation expands dots in expr, while the latter does it in group_by. But both implementations will produce the same output when a single column symbol is supplied to ....)
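To illustrate where the two approaches diverge, the sketch below forwards the dots directly and, as an alternative, captures them with `enquos()` and splices them with `!!!`. The function names `group_fwd` and `group_quos` are made up for this example, and mtcars is used only because it has several obvious grouping columns. Both versions should handle more than one column:

```r
library(dplyr)
library(rlang)

# Forward the dots untouched; group_by() does the capturing itself.
group_fwd <- function(data, ...) {
  data %>% group_by(...)
}

# Capture the dots explicitly, e.g. when they need inspecting first,
# then splice them back into the call with !!!.
group_quos <- function(data, ...) {
  dots <- enquos(...)
  data %>% group_by(!!!dots)
}

group_vars(group_fwd(mtcars, gear, carb))
#> [1] "gear" "carb"
group_vars(group_quos(mtcars, gear, carb))
#> [1] "gear" "carb"
```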

Likewise,

func_d <- function(data, grouping) {
  grouping <- expr(grouping)
  data %>% group_by(!!grouping)
}

is equivalent to

func_d <- function(data, grouping) {
  data %>% group_by(grouping)         # No column `grouping` in iris
}

To get func_d working, you need to replace expr() with enexpr(). This will capture the expression provided to the function, as opposed to the expression grouping itself.
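A sketch of that fix, assuming dplyr and rlang are installed (the name `func_d_fixed` is only for illustration):

```r
library(dplyr)
library(rlang)

func_d_fixed <- function(data, grouping) {
  # enexpr() captures what the caller typed for `grouping` (here: Species)
  grouping <- enexpr(grouping)
  data %>% group_by(!!grouping)
}

group_vars(func_d_fixed(iris, Species))
#> [1] "Species"
```

For a single named argument, this behaves like the "curly curly" version (func_a) from the question.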

Artem Sokolov