Input dplyr::filter to function

How can I create a function that takes any dplyr::filter call as input and returns the number of rows satisfying the filter?

I have tried something like this, which does not work:

library(tidyverse)

filter_function <- function(dataset, filter_text) {
    dataset %>% filter_text %>% nrow() -> n_rows_satisfy_filter

    paste0( "Number of rows satisfying the filter: ", n_rows_satisfy_filter)
}

Here I try to pass the filter as a string:

filter_function(iris, "filter( Sepal.Length > 5 & Species == 'setosa' )" )

This gives the error:

Error in quote(., filter_text) : 
  2 arguments passed to 'quote' which requires 1 

This question is similar to, but not a duplicate of, "Using dplyr filter() in programming", because here the whole filter call varies, not just the input to a static filter.

Rasmus Larsen

2 Answers

Try this code. eval() evaluates its expr argument in the environment specified by envir and returns the computed value. Note that the filter string has to refer to the function's dataset argument by name.

library(tidyverse)

filter_function <- function(dataset, filter_text) {
  n_rows_satisfy_filter <- eval(parse(text = filter_text), envir = dataset) %>% nrow()
  paste0( "Number of rows satisfying the filter: ", n_rows_satisfy_filter)
}

filter_function(iris, "filter(dataset, Sepal.Length > 5 & Species == 'setosa' )" )
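
To illustrate what passing a data frame as envir does, here is a minimal base R sketch (no dplyr involved). It assumes the string holds only the logical condition; the names cond_text, cond and keep are made up for this illustration.

```r
# Parse a condition string and evaluate it with the data frame as the
# environment, so bare column names resolve to columns of iris.
cond_text <- "Sepal.Length > 5 & Species == 'setosa'"
cond <- parse(text = cond_text)[[1]]   # unevaluated call
keep <- eval(cond, envir = iris)       # logical vector, one entry per row
n_rows <- sum(keep)
paste0("Number of rows satisfying the filter: ", n_rows)
```

The same lookup rule is why the column names inside the filter string are found in the answer's code.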
myincas

With the tidyverse, another option is parse_expr() from rlang:

library(dplyr)
filter_function <- function(dataset, filter_text) {
  eval(rlang::parse_expr(filter_text)) %>% 
         nrow() %>%
         paste0( "Number of rows satisfying the filter: ", .)
}

filter_function(iris, "filter(dataset, Sepal.Length > 5 & Species == 'setosa' )" )
#[1] "Number of rows satisfying the filter: 22"
akrun
  • Is it possible to avoid hardcoding the dataset into the string and still use the tidyverse? That is, something like `dataset %>% eval(parse_expr(filter_text))`, where filter_text is something like `filter( Sepal.Length > 5 & Species == 'setosa' )`? – Rasmus Larsen Nov 29 '17 at 18:43
  • @RasmusLarsen It needs the environment of the dataset – akrun Nov 30 '17 at 03:40
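
Regarding the comment about not hardcoding the dataset in the string, two possible sketches follow. Neither comes from the answers above, and the function names count_rows_condition and count_rows_call are hypothetical. The first assumes the string holds only the condition and splices it into filter() with `!!`. The second assumes the string is a full `filter(...)` call without a data argument and inserts dataset as its first argument before evaluating.

```r
library(dplyr)

# Variant 1 (assumption: the string is only the condition).
count_rows_condition <- function(dataset, condition_text) {
  condition <- rlang::parse_expr(condition_text)
  dataset %>% filter(!!condition) %>% nrow()
}

# Variant 2 (assumption: the string is a filter() call with no data argument).
count_rows_call <- function(dataset, filter_text) {
  call <- rlang::parse_expr(filter_text)
  # Insert the symbol `dataset` as the first argument of the parsed call.
  call <- as.call(append(as.list(call), list(quote(dataset)), after = 1))
  # eval() runs in this function's frame, where `dataset` is defined.
  nrow(eval(call))
}

count_rows_condition(iris, "Sepal.Length > 5 & Species == 'setosa'")
count_rows_call(iris, "filter( Sepal.Length > 5 & Species == 'setosa' )")
```

Both calls should agree with the count of 22 reported in the answer above. Evaluating arbitrary strings runs whatever code they contain, so this is only reasonable for trusted input.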