Could someone explain why f1 behaves differently than f2 in this example:

library(dplyr)

f1 <- function(data, year){
  data %>% 
    filter(year == year)
}

f2 <- function(data, y){
  data %>% 
    filter(year == y)
}

f3 <- function(data, year){
  data %>% 
    filter(!!year == year)
}


df <- data.frame(year = 2000:2005)

f1(df, 2005)
#>   year
#> 1 2000
#> 2 2001
#> 3 2002
#> 4 2003
#> 5 2004
#> 6 2005
f2(df, 2005)
#>   year
#> 1 2005
f3(df, 2005)
#>   year
#> 1 2005

I know this has something to do with tidy evaluation and I had a look at the vignette on Programming with dplyr. But the example here seems somewhat different.

I see that the problem can be fixed by using !! in f3, but I am not entirely sure what happens here. I would be interested to know if this is the optimal solution to the problem and if it is recommended to always use !! in similar situations.

Phil
  • It's because your argument has the same name as your column, so in the first example both sides of `year == year` resolve to the column and the condition is always TRUE – Maël Oct 06 '22 at 08:46
  • `filter()` looks for column names first, and other variables afterwards. If it finds a match it won't keep looking. So in your `f1` it's saying "filter this data where the contents of the column 'year' match the contents of the column 'year'", which of course they always do. In `f3` the `!!` can be thought of as specifying "the contents of the variable...", so it filters where "the contents of the column year match the contents of the variable year". [link to technical explanation](https://adv-r.hadley.nz/quasiquotation.html#unquoting-one-argument) – Paul Stafford Allen Oct 06 '22 at 08:53
  • Thanks a lot for your response. That's really useful. Is there a way to make sure that the second `year` in `year == year` is interpreted explicitly as an env-variable and *not* as a data-variable? – Phil Oct 06 '22 at 09:02
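
On the last question in the comments: one way to make the intent explicit is probably the `.data` and `.env` pronouns, which are available inside dplyr's data-masking verbs. The sketch below assumes the same `df` as in the question; `f4` is a hypothetical name that does not appear in the original post.

```r
library(dplyr)

# f4 is a hypothetical variant of f1 from the question.
# .data$year always refers to the column `year` in `data`;
# .env$year always refers to the function argument `year`.
f4 <- function(data, year) {
  data %>%
    filter(.data$year == .env$year)
}

df <- data.frame(year = 2000:2005)

f4(df, 2005)
#>   year
#> 1 2005
```

For this example the result matches `f3`. The difference is that `!!year` inlines the argument's value into the expression before `filter()` evaluates it, whereas `.env$year` looks the name up in the calling environment and skips the data frame's columns. Whether one is preferable to the other is a judgment call; renaming the argument as in `f2` avoids the ambiguity altogether.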
