I am trying to program with dplyr, but I don't understand how filter behaves with quoted variables.

After several attempts to analyze my real data, I created the following dummy data.

dt <- data.frame(
  sex = rep(c("F","M"), 50),
  height = runif(100, 1, 1000),
  weight = rep(c(2, 100), 50),
  value = runif(100, 1, 1000 ),
  stringsAsFactors =  FALSE
)



library(dplyr)


wizard_fun_1 <-  function(param1){
  par1 <- enquo(param1)

  dt %>% select(height, !!par1)
}

wizard_fun_1("sex")

# as expected
#1    74.875344   F
#2   846.614856   M
#.....


wizard_fun_2 <-  function(param1){
  par1 <- enquo(param1)

  dt %>% select(height, !!par1)  %>%
    filter( (!!par1) == 'M')
}

wizard_fun_2('sex')

#[1] height sex  
# ... zero rows....

What's going wrong? Thanks in advance for any ideas!

NT_
  • See [here](https://stackoverflow.com/questions/44121728/programming-with-dplyr-using-string-as-input/44122936#44122936) for working with strings as input. – aosmith Sep 11 '17 at 19:27

2 Answers

In the function you are using enquo, but then when you call the function you pass the column name as a string rather than the bare name. You just need to use the bare column name when calling the function and it works as written.


library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

dt <- tibble(
  sex = rep(c("F","M"), 50),
  height = runif(100, 1, 1000),
  weight = rep(c(2, 100), 50),
  value = runif(100, 1, 1000 )
)


wizard_fun_2 <-  function(param1){
  par1 <- enquo(param1)

  dt %>% select(height, !!par1)  %>%
    filter( (!!par1) == "M")
}

wizard_fun_2(sex)

#> # A tibble: 50 x 2
#>      height   sex
#>       <dbl> <chr>
#>  1 871.7788     M
#>  2 467.9220     M
#>  3 272.6478     M
#>  4 445.1101     M
#>  5 682.2095     M
#>  6 831.8522     M
#>  7 727.9525     M
#>  8 203.7829     M
#>  9 742.3000     M
#> 10 322.0473     M
#> # ... with 40 more rows
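
To see why the quoted call returns zero rows, it can help to inspect what `enquo` actually captures. The following is a minimal sketch, assuming an rlang version that provides `quo_get_expr()`; the helper name `show_capture` is hypothetical and only there for illustration.

```r
library(rlang)

# Hypothetical helper: return the expression that enquo() captured
show_capture <- function(param1) {
  par1 <- enquo(param1)
  quo_get_expr(par1)
}

captured_string <- show_capture("sex")  # a character string, not a column reference
captured_symbol <- show_capture(sex)    # a symbol that dplyr can look up in the data

class(captured_string)
class(captured_symbol)
```

With the string, the filter condition becomes a comparison between the two literals "sex" and "M", which is FALSE for every row.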
austensen

If you are using enquo, you should call your function without quotes. For example:

wizard_fun_2(sex)

will work just fine. The select function can take strings or symbols. That is, both of these will work:

select(dt, sex) # more common
select(dt, "sex")

But that's not the same for filter():

filter(dt, sex == "M")    # compares the values in the sex column
filter(dt, "sex" == "M")  # compares two string literals: always FALSE, so zero rows

So be careful when jumping between strings and unquoted symbols/names. When you are using quoted strings, you're not really using non-standard evaluation at all.
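
If the function really has to accept the column name as a string, one option is to skip `enquo` entirely. This is only a sketch under assumptions: it relies on `all_of()` and the `.data` pronoun, which need a reasonably recent dplyr (1.0.0 or later for `all_of()`), and the helper name `wizard_fun_str` is hypothetical.

```r
library(dplyr)

dt <- data.frame(
  sex = rep(c("F", "M"), 50),
  height = runif(100, 1, 1000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: param1 is a string naming a column of dt
wizard_fun_str <- function(param1) {
  dt %>%
    select(height, all_of(param1)) %>%   # all_of() selects columns named by a character vector
    filter(.data[[param1]] == "M")       # .data[[...]] looks the column up by its name
}

res <- wizard_fun_str("sex")
```

Here `res` should hold only the rows where sex is "M", with the columns height and sex.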

MrFlick
  • @aosmith Can someone show me how to actually do this using a character argument to the function specifying a variable and `filter`? None of my `quo`, `sym`, etc incantations are working. – joran Sep 11 '17 at 19:39
  • @joran: `f <- function(param1){ par1<- rlang::sym(param1); dt %>% select(height, !!(par1)) %>% filter( (!!par1) == 'M') }; f("sex")` should do the trick. – MrFlick Sep 11 '17 at 19:42
  • Ah, yikes, you have to have the parens in `(!!par1)` it seems. That seems...tricky. – joran Sep 11 '17 at 19:44
  • @joran. Ah yes. `==` has a higher precedence than `!`. Since the bang operator was never really meant to be used this way, things like this happen. – MrFlick Sep 11 '17 at 19:48
  • @MrFlick, thanks for the snippet. This is exactly what I need))) Yeehh))) – NT_ Sep 12 '17 at 07:01
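
The precedence point from the comments above can be checked in base R by looking at how the parser groups the expression. This is a small sketch using only `quote()`; the variable names are made up for illustration. Newer rlang releases may compensate for this when they process `!!`, so the parentheses might not be strictly required there, but that is an assumption to verify against your installed version.

```r
# Without parentheses, the parser applies `==` first and then the two `!` calls
no_parens <- quote(!!par1 == "M")
outer_fn <- no_parens[[1]]             # the outermost call is `!`
inner_fn <- no_parens[[2]][[2]][[1]]   # the innermost call is `==`

# With parentheses, the comparison is the outermost call
with_parens <- quote((!!par1) == "M")
paren_outer_fn <- with_parens[[1]]     # the outermost call is `==`

outer_fn
inner_fn
paren_outer_fn
```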