
Let's say I want to filter the starwars data frame programmatically. Here's a simple example that lets me filter based on homeworld and species:

library(tidyverse)

# a function that allows the user to supply filters
filter_starwars <- function(filters) {
  for (filter in filters) {
    starwars = filter_at(starwars, filter$var, all_vars(. %in% filter$values))
  }

  return(starwars)
}

# filter Star Wars characters that are human, and from either Tatooine or Alderaan
filter_starwars(filters = list(
  list(var = "homeworld", values = c("Tatooine", "Alderaan")),
  list(var = "species", values = "Human")
))

But this doesn't let me specify, say, a height filter, because I've hard-coded the %in% operator in the .vars_predicate of filter_at(), and a height filter would use one of the >, >=, <, <=, or == operators.

What is the best way to write the filter_starwars() function so that the user can supply filters that are general enough to filter along any column and use any operator?

NB: using the now-deprecated filter_() function, I could pass a string:

filter_(starwars, "species == 'Human' & homeworld %in% c('Tatooine', 'Alderaan') & height > 175")

But again, that has been deprecated.

tws
  • Related question discussing this issue using `filter_` - https://stackoverflow.com/questions/38493031/r-pass-a-list-of-filtering-conditions-into-a-dataframe/38493329 – thelatemail Jul 17 '17 at 00:44

2 Answers


Try

filter_starwars <- function(...) {
  F <- quos(...)
  filter(starwars, !!!F)
}

filter_starwars(species == 'Human', homeworld %in% c('Tatooine', 'Alderaan'), height > 175)
# # A tibble: 7 × 13
#                  name height  mass  hair_color skin_color eye_color birth_year
#                 <chr>  <int> <dbl>       <chr>      <chr>     <chr>      <dbl>
# 1         Darth Vader    202   136        none      white    yellow       41.9
# 2           Owen Lars    178   120 brown, grey      light      blue       52.0
# 3   Biggs Darklighter    183    84       black      light     brown       24.0
# 4    Anakin Skywalker    188    84       blond       fair      blue       41.9
# 5         Cliegg Lars    183    NA       brown       fair      blue       82.0
# 6 Bail Prestor Organa    191    NA       black        tan     brown       67.0
# 7     Raymus Antilles    188    79       brown      light     brown         NA
# # ... with 6 more variables: gender <chr>, homeworld <chr>, species <chr>,
# #   films <list>, vehicles <list>, starships <list>

See https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html. Briefly, quos() captures ... as a list of quosures without evaluating the arguments. !!! then unquotes and splices that list, so each captured expression reaches filter() as a separate argument and is evaluated there.
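To make the capture-then-splice step concrete, here is a minimal sketch that separates the two stages. The helper name capture_filters() is hypothetical and exists only for illustration.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: capture the filter expressions without evaluating them
capture_filters <- function(...) {
  quos(...)
}

conds <- capture_filters(species == "Human", height > 175)

# conds is a list with one quosure per argument; nothing has been evaluated yet
length(conds)

# !!! splices the captured expressions into filter() as separate arguments,
# so this behaves like filter(starwars, species == "Human", height > 175)
tall_humans <- filter(starwars, !!!conds)
```

Because the expressions are only evaluated inside filter(), column names such as species and height are looked up in the data frame rather than in the calling environment.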

Weihuang Wong
  • How would we extend this solution to cases where the function's argument list can have other arguments (not filters)? For instance, this would fail because the first two arguments (species and homeworld) in the call to the function will be interpreted as the arguments for `x` and `y` - `filter_starwars <- function(x = 1, y = 2, ...) { # skipping for brevity }` `filter_starwars(species == 'Human', homeworld %in% c('Tatooine', 'Alderaan'), height > 175)` – mkirzon Nov 13 '18 at 02:01
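
One possible way to handle the case raised in this comment, offered as a sketch rather than as part of the answer above: in R, arguments declared after ... can only be matched by name, so unnamed filter expressions are never mistaken for them. The extra argument data used here is a hypothetical example.

```r
library(dplyr)

# Hypothetical signature: non-filter arguments come after ..., so callers must
# name them and unnamed expressions always land in ...
filter_starwars <- function(..., data = starwars) {
  filter(data, ...)
}

# All three unnamed expressions are treated as filters
res_default <- filter_starwars(species == "Human",
                               homeworld %in% c("Tatooine", "Alderaan"),
                               height > 175)

# The extra argument has to be supplied by name
res_named <- filter_starwars(height > 175, data = head(starwars, 10))
```

The trade-off is that the extra arguments can no longer be passed positionally.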

Here are some approaches.

1) For this particular problem we don't actually need filter_, rlang or similar. This works:

filter_starwars <- function(...) {
    filter(starwars, ...)
}

# test
filter_starwars(species == 'Human',
                homeworld %in% c('Tatooine', 'Alderaan'),
                height > 175)

2) If it is important to have character arguments then:

library(rlang)

filter_starwars <- function(...) {
    filter(starwars, !!!parse_exprs(paste(..., sep = ";")))
}

# test
filter_starwars("species == 'Human'", 
                "homeworld %in% c('Tatooine', 'Alderaan')", 
                "height > 175")

2a) or if a single character vector is to be passed:

library(rlang)

filter_starwars <- function(filters) {
    filter(starwars, !!!parse_exprs(paste(filters, collapse = ";")))
}

# test 
filter_starwars(c("species == 'Human'", 
                  "homeworld %in% c('Tatooine', 'Alderaan')", 
                  "height > 175"))
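
A further variation, sketched here as an assumption rather than one of the approaches above: if the filters should stay in the list-of-lists form used in the question, each entry could carry its operator as a string and be turned into a call with rlang's call2() and sym(). The op field is a hypothetical addition to the question's filter structure.

```r
library(dplyr)
library(rlang)

# Hypothetical filter spec: each entry names a column, an operator, and the values
filter_starwars <- function(filters) {
  # Build one unevaluated call per spec, e.g. height > 175
  conds <- unname(lapply(filters, function(f) {
    call2(f$op, sym(f$var), f$values)
  }))
  # Splice the calls into filter() as separate conditions
  filter(starwars, !!!conds)
}

result <- filter_starwars(list(
  list(var = "species",   op = "==",   values = "Human"),
  list(var = "homeworld", op = "%in%", values = c("Tatooine", "Alderaan")),
  list(var = "height",    op = ">",    values = 175)
))
```

This avoids parsing strings as code, since only the operator name and column name come from the caller as text.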
G. Grothendieck