I would like to be able to define arguments for dplyr verbs as strings:

condition <- "dist > 50"

and then use these strings in dplyr functions:

library(dplyr)  # provides filter() and %>%; cars ships with base R
ds <- cars
ds1 <- ds %>%
   filter (eval(condition))
ds1

But it throws an error:

Error: filter condition does not evaluate to a logical vector. 

The code should evaluate as:

  ds1<- ds %>%
     filter(dist > 50)
  ds1

Resulting in:

ds1

   speed dist
1     14   60
2     14   80
3     15   54
4     18   56
5     18   76
6     18   84
7     19   68
8     20   52
9     20   56
10    20   64
11    22   66
12    23   54
13    24   70
14    24   92
15    24   93
16    24  120
17    25   85

Question:

How can I pass a string as an argument to a dplyr verb?

JJJ
andrey

3 Answers


In the next version of dplyr, it will probably work like this:

condition <- quote(dist > 50)

# cars (not mtcars) is the data set that has a dist column
cars %>%
   filter_(condition)
hadley
  • can't wait. dplyr keeps surprising me with its intuitiveness and intelligence. Thanks, Hadley! – andrey Jul 10 '14 at 06:15
  • And what if one wants to pass multiple arguments? Passing a list like `list("dist > 50", "speed > 10")` returns `Error: Can't convert a list to a quosure` – rvrvrv Jun 01 '19 at 14:03
  • EDIT: found it: `paste(list('dist > 50', 'speed > 10'), collapse=" & ")` – rvrvrv Jun 01 '19 at 15:14
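
For several conditions held as separate strings, an alternative to pasting them together is to parse each string and splice the resulting expressions into `filter()` with `!!!`; `filter()` combines multiple arguments with a logical AND. This is only a sketch, not part of the answer above: the helper name `filter_by_strings` and the example conditions are hypothetical, and parsing strings as code is only reasonable for trusted input.

```r
library(dplyr)

# Hypothetical helper: parse each condition string into an expression,
# then splice the list of expressions into filter() with !!!.
filter_by_strings <- function(data, conditions) {
  exprs <- lapply(conditions, rlang::parse_expr)
  dplyr::filter(data, !!!exprs)
}

# Rows must satisfy both conditions.
both <- filter_by_strings(cars, c("dist > 50", "speed > 20"))
both
```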

Since these 2014 answers, two new ways are possible using rlang's quasiquotation.

Conventional hard-coded filter statement. For the sake of comparison, the statement dist > 50 is included directly in dplyr::filter().

library(magrittr)

# The filter statement is hard-coded inside the function.
cars_subset_0 <- function( ) {
  cars %>%
    dplyr::filter(dist > 50)
}
cars_subset_0()

results:

   speed dist
1     14   60
2     14   80
3     15   54
4     18   56
...
17    25   85

rlang approach with NSE (nonstandard evaluation). As described in the Programming with dplyr vignette, the statement dist > 50 is processed by rlang::enquo(), which "uses some dark magic to look at the argument, see what the user typed, and return that value as a quosure". Then rlang's !! unquotes the input "so that it’s evaluated immediately in the surrounding context".

# The filter statement is evaluated with NSE.
cars_subset_1 <- function( filter_statement ) {
  filter_statement_en <- rlang::enquo(filter_statement)
  # as_label() turns the quosure into readable text for the message.
  message("filter statement: `", rlang::as_label(filter_statement_en), "`.")

  cars %>%
    dplyr::filter(!!filter_statement_en)
}
cars_subset_1(dist > 50)

results:

filter statement: `dist > 50`.
   speed dist
1     14   60
2     14   80
3     15   54
4     18   56
...
17    25   85
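
With rlang 0.4.0 or later, the `enquo()` plus `!!` pair can usually be written with the embrace operator `{{ }}`. A minimal sketch, assuming a reasonably recent dplyr; the function name `cars_subset_1b` is hypothetical:

```r
library(dplyr)

# {{ }} captures the expression the caller typed and unquotes it in one step.
cars_subset_1b <- function(filter_statement) {
  cars %>%
    dplyr::filter({{ filter_statement }})
}

subset_1b <- cars_subset_1b(dist > 50)
subset_1b
```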

rlang approach passing a string. The statement "dist > 50" is passed to the function as an explicit string, and parsed as an expression by rlang::parse_expr(), then unquoted by !!.

# The filter statement is passed a string.
cars_subset_2 <- function( filter_statement ) {
  filter_statement_expr <- rlang::parse_expr(filter_statement)
  # as_label() deparses the expression so the message reads as typed.
  message("filter statement: `", rlang::as_label(filter_statement_expr), "`.")

  cars %>%
    dplyr::filter(!!filter_statement_expr)
}
cars_subset_2("dist > 50")

results:

filter statement: `dist > 50`.
   speed dist
1     14   60
2     14   80
3     15   54
4     18   56
...
17    25   85
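
To see why a parsed string works where `eval(condition)` in the question did not, the steps can be run by hand. This is an illustrative sketch that uses base R's `eval()` next to `rlang::parse_expr()`; the variable names are hypothetical.

```r
condition <- "dist > 50"

# Evaluating a string just returns the string, not a logical vector,
# which is what filter() complained about in the question.
as_string <- eval(condition)

# Parsing first turns the text into a call, which can then be evaluated
# with the data frame's columns in scope.
condition_expr <- rlang::parse_expr(condition)
keep <- eval(condition_expr, cars)

cars[keep, ]
```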

Things are simpler with dplyr::select(). Explicit strings need only !!.

# The select statement is passed a string.
cars_subset_2b <- function( select_statement ) {
  cars %>%
    dplyr::select(!!select_statement)
}
cars_subset_2b("dist")
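
In current tidyselect, a character vector of column names is usually wrapped in `all_of()`, which raises an error if a named column is missing. A hedged sketch, assuming dplyr 1.0.0 or later (which re-exports `all_of()`); `cars_subset_2c` is a hypothetical name:

```r
library(dplyr)

# The column names arrive as a character vector and are selected strictly.
cars_subset_2c <- function(select_columns) {
  cars %>%
    dplyr::select(dplyr::all_of(select_columns))
}

dist_only <- cars_subset_2c("dist")
head(dist_only)
```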
wibeasley

While they're working on that, here is a workaround using if:

library(dplyr)
library(magrittr)

ds <- data.frame(attend = c(1:5,NA,7:9,NA,NA,12))

filter_na <- FALSE

filtertest <- function(x,filterTF = filter_na){
  if(filterTF) x else !(x)
}

ds %>%
  filter(attend %>% is.na %>% filtertest)

  attend
1      1
2      2
3      3
4      4
5      5
6      7
7      8
8      9
9     12

filter_na <- TRUE
ds %>%
  filter(attend %>% is.na %>% filtertest)

  attend
1     NA
2     NA
3     NA
AndrewMacDonald
  • thanks, @AndrewMacDonald! sorry, for not offering a reproducible example earlier – andrey Jul 08 '14 at 16:55
  • great, thanks @AndrewMacDonald, this works and on top gives me a simple example of using function with dplyr - something i wanted to have for reference. Thanks again! – andrey Jul 08 '14 at 20:08
  • glad it was useful! I edited it above very slightly (one shouldn't use `$` within `filter`!) – AndrewMacDonald Jul 08 '14 at 22:29
  • I didn't know about `$` and didn't notice a malfunction (or didn't realize it was due to that?). Thanks for the heads-up, I'll keep this in mind. – andrey Jul 10 '14 at 06:09