I am trying to implement advice I found on the web, but I have only gotten halfway to where I want to be.

Here is a reproducible example:

library(tidyverse)
library(dplyr)
library(rlang)

data(mtcars)

filter_expr = "am == 1"

mutate_expr = "gear_carb = gear*carb"

select_expr = "mpg , cyl"

mtcars %>% filter_(filter_expr) %>% mutate_(mutate_expr) %>% select_(select_expr)

The filter expression works fine.

The mutate expression works as well but the new variable has the name gear_carb = gear*carb instead of the intended gear_carb.

Finally, the select expression returns an exception.

halfer
user8270077
  • where is this advice coming from? – Arthur Yip Apr 14 '18 at 13:03
  • Also, the underscore versions are now deprecated. From "Deprecated SE versions of main verbs": "dplyr used to offer twin versions of each verb suffixed with an underscore. These versions had standard evaluation (SE) semantics: rather than taking arguments by code, like NSE verbs, they took arguments by value. Their purpose was to make it possible to program with dplyr. However, dplyr now uses tidy evaluation semantics. NSE verbs still capture their arguments, but you can now unquote parts of these arguments. This offers full programmability with NSE verbs. Thus, the underscored versions are now superfluous." – Arthur Yip Apr 14 '18 at 13:04
  • There is probably an answer here...[https://stackoverflow.com/a/40164111/7033572](https://stackoverflow.com/a/40164111/7033572). Also recommend watching this tutorial on tidy evaluation. [https://www.rstudio.com/resources/videos/tidy-eval-programming-with-dplyr-tidyr-and-ggplot2/](https://www.rstudio.com/resources/videos/tidy-eval-programming-with-dplyr-tidyr-and-ggplot2/) – Jacob Nelson Apr 15 '18 at 21:49

1 Answer

As mentioned in the comments, the underscore versions of dplyr verbs are now deprecated. The correct approach is to use quasiquotation.

To address your issue with select, you simply need to modify select_expr to contain multiple expressions:

## I renamed your variables to *_str because they are, well, strings.
filter_str <- "am == 1"
mutate_str <- "gear_carb = gear*carb"
select_str <- "mpg; cyl"                # Note the ;

Use rlang::parse_expr to convert these strings to unevaluated expressions:

## Notice the plural parse_exprs, which parses a list of expressions
filter_expr <- rlang::parse_expr( filter_str )
mutate_expr <- rlang::parse_expr( mutate_str )
select_expr <- rlang::parse_exprs( select_str )
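
To see what the parsing step produces, the objects can be inspected before they are handed to any dplyr verb. This is a minimal sketch that assumes only that rlang is installed; the variable names mirror the ones above.

```r
library(rlang)

# parse_expr() turns one string into a single unevaluated call
filter_expr <- parse_expr("am == 1")
is.call(filter_expr)            # the comparison is stored as a call, not evaluated

# parse_exprs() splits on ';' and returns a list with one entry per expression
select_expr <- parse_exprs("mpg; cyl")
length(select_expr)             # one element per column name
is.symbol(select_expr[[1]])     # each bare column name is parsed as a symbol
```

The original string "mpg , cyl" is not valid R code on its own, which is why the separator is changed to a semicolon before parsing.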

Given the unevaluated expressions, we can now pass them to the dplyr verbs. Writing filter( filter_expr ) won't work because filter will look for a column named filter_expr in your data frame. Instead, we want to access the expression stored inside filter_expr. To do this, we use the !! operator, which tells dplyr verbs to replace the argument with its contents (the unevaluated expression we are interested in):

mtcars %>% filter( !!filter_expr )
#     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
# 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
# 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
# 4  32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1

mtcars %>% mutate( !!mutate_expr )
#     mpg cyl  disp  hp drat    wt  qsec vs am gear carb gear_carb = gear * carb
# 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4                      16
# 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4                      16
# 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1                       4
# 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1                       3
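
Note that the new column in the output above is still named after the whole expression rather than gear_carb. One possible way to get the intended name is to keep the column name and the right-hand side as two separate strings and combine them with the := operator. This is a sketch under that assumption; new_name and value_str are hypothetical variables that do not appear in the question.

```r
library(dplyr)
library(rlang)

new_name  <- "gear_carb"      # hypothetical: the column name kept as its own string
value_str <- "gear * carb"    # hypothetical: only the right-hand side of the assignment

# !! on the left of := unquotes the name; !! on the right unquotes the parsed expression
result <- mtcars %>% mutate(!!new_name := !!parse_expr(value_str))

tail(names(result), 1)        # name of the newly added column
```

Keeping the two pieces apart avoids having to split a string like "gear_carb = gear*carb" on the = sign, which would be fragile if the expression itself contained == or <=.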

In the case of select, we have multiple expressions, which are handled by !!! instead:

mtcars %>% select( !!!select_expr )
#                      mpg cyl
# Mazda RX4           21.0   6
# Mazda RX4 Wag       21.0   6
# Datsun 710          22.8   4
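
The three verbs can also be chained into one pipeline, as in the original question. The sketch below makes two assumptions that differ from the question: the mutate string holds only the right-hand side, so the column can be named explicitly, and gear_carb is added to the select string so the new column remains visible in the result.

```r
library(dplyr)
library(rlang)

filter_expr <- parse_expr("am == 1")
mutate_expr <- parse_expr("gear * carb")              # right-hand side only (assumption)
select_expr <- parse_exprs("mpg; cyl; gear_carb")     # gear_carb added for illustration

result <- mtcars %>%
  filter(!!filter_expr) %>%                # single expression: !!
  mutate(gear_carb = !!mutate_expr) %>%    # explicit name on the left-hand side
  select(!!!select_expr)                   # list of expressions: !!!

head(result, 3)
```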

P.S. It's also worth mentioning that select works directly with string vectors, without having to rlang::parse_expr() them first:

mtcars %>% select( c("mpg", "cyl") )
#                      mpg cyl
# Mazda RX4           21.0   6
# Mazda RX4 Wag       21.0   6
# Datsun 710          22.8   4
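
If the column names arrive as a single comma-separated string, as in the question, they can be split into a character vector with base R and passed to select. The sketch below wraps the vector in all_of(), which recent dplyr and tidyselect releases provide for selecting from an external character vector; whether it is available depends on the installed version, so treat that as an assumption. The name cols is hypothetical.

```r
library(dplyr)

select_str <- "mpg , cyl"                              # the string from the question

# Split on commas and trim the surrounding whitespace
cols <- trimws(strsplit(select_str, ",", fixed = TRUE)[[1]])

# all_of() states explicitly that cols is a character vector of column names
result <- mtcars %>% select(all_of(cols))

names(result)
```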
Artem Sokolov
  • This seems so much more complicated than using `select_( string )`. Any reason why this was changed? – Sacha Epskamp Jun 24 '19 at 07:42
  • @SachaEpskamp: My understanding is that the change was to bring everything under a common tidyeval umbrella. It's also worth pointing out that `select_( string )`, while simpler conceptually, doesn't work with arbitrary expressions provided as strings. – Artem Sokolov Jun 24 '19 at 14:03