I'm updating an old script that uses the deprecated dplyr::filter_() so that it uses dplyr::filter() instead, but I can't get it to work for empty filter strings anymore:

Example:

library(dplyr)
my_df <- tibble::tibble(x = sample(c(0:9), 100, replace = TRUE))

The deprecated filter_() works for both a filter string and an empty (NULL) filter:

fil1 <- "x == 5"
filter_(my_df, .dots = fil1) # works

fil2 <- NULL
filter_(my_df, .dots = fil2) # works, returns all values

The NSE version works only with quoted filter values, not with empty ones:

fil1 <- quo(x == 5)
filter(my_df, !!enquo(fil1)) # works

fil2 <- NULL
filter(my_df, !!enquo(fil2))
#> Error: Argument 2 filter condition does not evaluate to a logical vector

fil2 <- quo(NULL)
filter(my_df, !!enquo(fil2))
#> Error: Argument 2 filter condition does not evaluate to a logical vector

I see three possible approaches to this:

  • quote NULL differently
  • use another expression instead of NULL
  • use another argument inside filter()
Timm S.
  • You say it's in a script; is there any reason you can't wrap it in a condition and test for "nullness"? – Chuck P May 15 '20 at 18:22
  • @Chuck P - sure I could, but that seems like an ugly workaround, I would have to do it in several places, and worst of all, doesn't still my curiosity as to how I can feed NULL into filter(). – Timm S. May 17 '20 at 09:39
  • No comment on what would be ugly not having seen your whole script. I have a workaround for you with your second bullet though. – Chuck P May 17 '20 at 14:30

2 Answers

If you specify your filters as lists of expressions (or NULL), you can use the unquote-splice operator !!! to effectively "paste" them as arguments to filter(). Use rlang::parse_exprs() to convert strings to expressions:

fil1 <- rlang::exprs(x == 5)          # Note the s on exprs
filter(my_df, !!!fil1)                # Works

fil2 <- NULL                          # NULL
filter(my_df, !!!fil2)                # Also works

fil3 <- rlang::parse_exprs("x==5")    # Again, note the plural s
filter(my_df, !!!fil3)                # Also works

The first and the third calls are effectively filter(my_df, x==5), while the second call is effectively filter(my_df,), i.e. a call with no filter conditions, which returns all rows.
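
As a rough sketch of how this could stand in for a filter_(df, .dots = ...) call that receives either strings or NULL, the splice can be wrapped in a small helper. The name filter_by_strings and its arguments are hypothetical (not part of dplyr or rlang), and demo_df is a deterministic stand-in for my_df:

```r
library(dplyr)

# Hypothetical helper (not part of dplyr/rlang): filter `df` by zero or more
# conditions given as strings; NULL means "no filter".
filter_by_strings <- function(df, conditions = NULL) {
  exprs <- if (is.null(conditions)) list() else rlang::parse_exprs(conditions)
  # Splicing an empty list adds no arguments, so all rows are kept
  filter(df, !!!exprs)
}

demo_df <- tibble::tibble(x = rep(0:9, times = 10))  # deterministic stand-in for my_df

one_cond  <- filter_by_strings(demo_df, "x == 5")            # rows where x is 5
no_cond   <- filter_by_strings(demo_df, NULL)                # all rows
two_conds <- filter_by_strings(demo_df, c("x > 2", "x < 5")) # conditions are combined with AND
```

Branching on NULL explicitly keeps the helper independent of how parse_exprs() treats empty input.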

Artem Sokolov

If I understand you correctly, @timm-s, your second bullet (use another expression instead of NULL) allows this solution:

set.seed(2020)
library(dplyr)

my_df <- tibble::tibble(x = sample(c(0:9), 100, replace = TRUE))

fil1 <- quo(x == 5)
filter(my_df, !!enquo(fil1)) # works
#> # A tibble: 11 x 1
#>        x
#>    <int>
#>  1     5
#>  2     5
#>  3     5
#>  4     5
#>  5     5
#>  6     5
#>  7     5
#>  8     5
#>  9     5
#> 10     5
#> 11     5

fil2 <- TRUE
filter(my_df, !!enquo(fil2)) 
#> # A tibble: 100 x 1
#>        x
#>    <int>
#>  1     6
#>  2     5
#>  3     7
#>  4     0
#>  5     0
#>  6     3
#>  7     9
#>  8     5
#>  9     0
#> 10     7
#> # … with 90 more rows

It simply relies on the fact that filter() keeps the rows for which the condition is TRUE, so instead of telling it nothing, tell it TRUE. For me the real question was why filter_() thought NULL was true, LOL.
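
If the check is needed in several places, one option might be a small wrapper that substitutes TRUE when no condition is supplied. This is only a sketch: filter_or_all is a hypothetical name, and demo_df is a deterministic stand-in for my_df.

```r
library(dplyr)

# Hypothetical helper: `cond` is a quosure made with quo(), or NULL for "no filter".
filter_or_all <- function(df, cond = NULL) {
  if (is.null(cond)) cond <- TRUE  # a single TRUE is recycled to every row
  filter(df, !!cond)
}

demo_df <- tibble::tibble(x = rep(0:9, times = 10))

fives    <- filter_or_all(demo_df, quo(x == 5))  # only rows where x is 5
all_rows <- filter_or_all(demo_df)               # no condition: every row is kept
```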

A little more playing revealed it's possible to simplify further for the empty case:

fil3 <- TRUE
filter(my_df, fil3) 

will also work but may not fit your circumstances.
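
One circumstance where the bare-variable form may not fit: filter() looks names up in the data frame's columns before the calling environment, so a column that happens to share the variable's name would be used instead. A small illustration, with a made-up column fil3 added purely for the demonstration; injecting the value with !! sidesteps the lookup:

```r
library(dplyr)

# Made-up data: the column `fil3` deliberately clashes with the variable below
clash_df <- tibble::tibble(x = 0:9, fil3 = x < 3)
fil3 <- TRUE

via_column <- filter(clash_df, fil3)    # `fil3` resolves to the column
via_value  <- filter(clash_df, !!fil3)  # the variable's value TRUE is injected

nrow(via_column)
nrow(via_value)
```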

Chuck P