Original question
Can anyone explain to me why unquoting does not work in the following?
I want to pass a user-specified column name, received as a function argument, on to a call to do in version 0.7.4 of dplyr. This does seem somewhat less awkward than the older standard-evaluation approach using do_. A basic (successful) example, ignoring the fact that using do here is quite unnecessary, would be something like:
    library(dplyr)

    sum_with_do <- function(D, x, ...) {
      x <- rlang::ensym(x)   # capture the column name as a symbol
      gr <- quos(...)        # capture the grouping columns
      D %>%
        group_by(!!! gr) %>%
        do(data.frame(y = sum(.[[quo_name(x)]])))
    }

    D <- data.frame(group = c('A', 'A', 'B'), response = c(1, 2, 3))
    sum_with_do(D, response, group)
    # A tibble: 2 x 2
    # Groups: group [2]
    #   group     y
    #   <fct> <dbl>
    # 1 A        3.
    # 2 B        3.
The rlang:: prefix is unnecessary as of dplyr 0.7.5, which now exports ensym. I have included lionel's suggestion to use ensym here rather than enquo, as the former guarantees that the value of x is a symbol (not an expression).
Unquoting is not useful here (compare other dplyr examples): replacing quo_name(x) with !! x in the above produces the following error:
Error in ~response : object 'response' not found
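The failure can be reproduced outside dplyr with a rough sketch. This only mimics the situation and is not dplyr's actual implementation: slice and group_env are hypothetical stand-ins for one group of the data and for an environment in which only . is bound.

```r
library(rlang)

# Hypothetical stand-in for the rows of group 'A', as do() would bind them to `.`
slice <- data.frame(group = c("A", "A"), response = c(1, 2))

x <- sym("response")

# Build both candidate expressions without evaluating them.
with_symbol <- expr(sum(.[[!!x]]))             # sum(.[[response]])
with_string <- expr(sum(.[[!!as_string(x)]]))  # sum(.[["response"]])

# An environment where `.` is bound but the column names are not.
group_env <- new.env(parent = baseenv())
assign(".", slice, envir = group_env)

eval(with_string, group_env)
# 3

# The unquoted symbol is looked up as an ordinary variable, which does not exist.
tryCatch(eval(with_symbol, group_env), error = function(e) conditionMessage(e))
# "object 'response' not found"
```

Unquoting the symbol yields .[[response]], so response is treated as a variable to look up rather than as a column name, whereas converting the symbol to a string yields a valid column index.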
Explanation
As per the accepted response, the underlying reason is that do does not evaluate the expression in the same environment that other dplyr functions (e.g. mutate) use.
I did not find this to be abundantly clear from either the documentation or the source code (e.g. compare the source for mutate and do for data.frames and follow Alice down the rabbit hole if you wish), but essentially, and this is probably nothing new to most:
- do evaluates expressions in an environment whose parent is the calling environment, and attaches the current group (slice) of the data.frame to the symbol ., and;
- other dplyr functions 'more or less' evaluate the expressions in the environment of the data.frame, with the parent being the calling environment.
See also Advanced R. 22. Evaluation for a description in terms of 'data masking'.
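The two evaluation models described above can be sketched in base R. This is a simplified illustration under the assumptions stated in the explanation, not dplyr's real code; do_like and mask_like are hypothetical names.

```r
D <- data.frame(group = c("A", "A", "B"), response = c(1, 2, 3))

# do()-like: only `.` is bound; the parent is the calling environment.
do_like <- function(data, expr) {
  caller <- parent.frame()
  env <- new.env(parent = caller)
  assign(".", data, envir = env)
  eval(expr, env)
}

# mutate()-like: the columns themselves are in scope (data masking),
# with the calling environment as the enclosure.
mask_like <- function(data, expr) {
  caller <- parent.frame()
  eval(expr, data, caller)
}

mask_like(D, quote(sum(response)))
# 6

do_like(D, quote(sum(.$response)))
# 6

# A bare column name is not visible in the do()-like model.
tryCatch(do_like(D, quote(sum(response))), error = function(e) conditionMessage(e))
# "object 'response' not found"
```

In the do()-like model the columns are reachable only through ., which is why the column name has to be supplied as a string or via .$ rather than as an unquoted symbol.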