Take this function foo(). I want it to have a default argument of cyl because that's the name of the field it will usually process.

library(tidyverse)

foo <- function(x = cyl){
    case_when(
        x == 6 ~ TRUE,
        x == 8 ~ FALSE,
        x == 4 ~ NA
    )
}

# works: 
mtcars %>% 
    mutate(cyl_refactor = foo(cyl)) %>% 
    select(cyl, cyl_refactor)

But I am surprised that the function does not work unless I explicitly supply the argument that is already its default. See the failing code below:

# fails:
mtcars %>% 
    mutate(cyl_refactor = foo()) %>% 
    select(cyl, cyl_refactor)

Error: Problem with `mutate()` column `cyl_refactor`.
ℹ `cyl_refactor = foo()`.
x object 'cyl' not found

It seems that default arguments are only processed when there is also a data parameter as below.

foo2 <- function(data, x = cyl){
    data %>% 
        mutate(cyl_refactor = case_when(
        {{x}} == 6 ~ TRUE,
        {{x}} == 8 ~ FALSE,
        {{x}} == 4 ~ NA
    ))
}

mtcars %>% 
    foo2() %>% 
    select(cyl, cyl_refactor)

I am sure there is some gap in my knowledge of quasiquotation, but I would like to understand how to use a default argument in foo().

Joe
  • This is obviously just example code. Is there a way to supply foo() with a default argument? Do I need to use quoted column names, and if so, can quoted column names be processed by case_when()? – Joe Sep 08 '21 at 20:12
  • In my real use case it would be better to avoid a data parameter – Joe Sep 08 '21 at 20:13
  • So when I try the sym() method I am getting an error: Only strings can be converted to symbols

    retain_92dv2 <- function(gns_date = 'qbcommerce_gns_date', cancel_date = 'qbcommerce_cancel_date', duration = 92){
        gns_date <- sym(gns_date)
        cancel_date <- sym(cancel_date)
        dplyr::case_when(
            Sys.Date() - !!gns_date <= duration ~ NA, # unbaked
            Sys.Date() - !!gns_date > duration & is.na(!!cancel_date) ~ TRUE, # baked + no cancel = retained
            !!cancel_date - !!gns_date > duration ~ TRUE,
            !!cancel_date - !!gns_date <= duration ~ FALSE,
            TRUE ~ NA
        )
    }

    – Joe Sep 08 '21 at 20:19
  • Is this the idea? To pass quoted arguments and then use arg <- sym(arg) followed by !!arg in the function? – Joe Sep 08 '21 at 20:21

2 Answers

Here's one that will "work", though I wouldn't recommend it:

foo <- function(x = cyl){
  x <- enquo(x)
  eval.parent(rlang::quo_squash(rlang::quo(case_when(
    !!x == 6 ~ TRUE,
    !!x == 8 ~ FALSE,
    !!x == 4 ~ NA
  ))))
}

# Both run without error
mtcars %>% 
  mutate(cyl_refactor = foo(cyl)) %>% 
  select(cyl, cyl_refactor)

mtcars %>% 
  mutate(cyl_refactor = foo()) %>% 
  select(cyl, cyl_refactor)

The problem is that case_when() needs actual values to work with, so you can't just pass in a column name without also passing in the data that column belongs to. To "find" the data in this case, I've used eval.parent() to go up the call chain and look for the cyl variable in the calling frame.

It's better to make proper functions where you pass in the input data directly (rather than variable names they need to look up themselves).
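
The "go up the call chain" idea can be shown in base R alone. This is a minimal sketch, not the answer's code: the names g, caller and hidden_value are hypothetical, and eval() with parent.frame() stands in for eval.parent(). It captures the unevaluated default with substitute() and evaluates it in the caller's frame.

```r
# Hypothetical illustration: g, caller and hidden_value are made-up names.
g <- function(x = hidden_value) {
  expr <- substitute(x)          # unevaluated argument expression (the default if x is omitted)
  caller_env <- parent.frame()   # frame of whoever called g()
  eval(expr, caller_env)         # look the expression up starting from that frame
}

caller <- function() {
  hidden_value <- 42             # exists only inside caller()
  g()                            # no argument supplied, default expression is used
}

result <- caller()
```

Here g() succeeds only because its caller happens to define hidden_value, which is the hidden dependency the answer advises against.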

MrFlick
  • So it sounds like using case_when() in a custom function without a data parameter is going to be problematic for default arguments that it can't "find". – Joe Sep 08 '21 at 20:24
  • @Joe It's a problem for any function when you don't pass the data directly as a parameter. There is nothing special about `case_when`. If the function has a default parameter that's a variable name, rather than an actual value, it's a problem if that variable isn't defined in the function and only exists in a calling environment. That conflicts with R's normal behavior of looking up variables using lexical scoping. – MrFlick Sep 08 '21 at 20:27
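
The scoping rule described in the last comment can be reproduced without dplyr. The following is a small sketch under stated assumptions: f, caller_default, caller_explicit and hidden_value are hypothetical names, and hidden_value is assumed not to exist in the global environment. A default argument is evaluated inside the function's own frame, whose parent is where the function was defined, while an explicitly supplied argument is evaluated in the caller's frame.

```r
# Hypothetical illustration: all names below are made up for this sketch.
f <- function(x = hidden_value) x

caller_default <- function() {
  hidden_value <- 42   # local to this function only
  f()                  # the default is evaluated in f's frame, not here
}

caller_explicit <- function() {
  hidden_value <- 42
  f(hidden_value)      # the supplied argument is evaluated in this frame
}

# Capture the error message instead of stopping the script
res_default <- tryCatch(caller_default(), error = function(e) conditionMessage(e))
res_explicit <- caller_explicit()
```

This mirrors the question: foo(cyl) is evaluated where mutate() makes cyl visible, whereas the default in foo() is looked up from foo's own environment, where no cyl exists.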

We could do this with missing() and cur_data_all():

foo <- function(x = cyl){
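    # When no argument is supplied, pull cyl from the data frame currently
    # being processed; cur_data_all() is only valid inside a dplyr verb
    if (missing(x)) x <- cur_data_all()[["cyl"]]
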
    case_when(
        x == 6 ~ TRUE,
        x == 8 ~ FALSE,
        x == 4 ~ NA
    )
}

Testing:

> out1 <- mtcars %>% 
+     mutate(cyl_refactor = foo(cyl)) %>% 
+     select(cyl, cyl_refactor)
> out2 <- mtcars %>% 
+     mutate(cyl_refactor = foo()) %>% 
+     select(cyl, cyl_refactor)
> 
> identical(out1, out2)
[1] TRUE
akrun