Using as.formula with a comma

Question

I'd like to get conditions dynamically from the user, so I built a shiny app that reads them from an input field. The problem is that as.formula() fails on a string that contains commas (a string with a single condition and no comma works fine).

Code:

all_conditions = 
  "condition1 ~ 0,
   condition2 ~ 1,
   condition3 ~ 2"

 my_dataset %>% group_by(id) %>%
  summarise(FLAG = case_when(
      as.formula(all_conditions) )
   )

I get:

Evaluation error: :2:100: unexpected ','

I have tried using paste and escaping the comma with no success.

no idea what you're trying to do. I think you need some data example to receive answers. Edit the question and add some sample data. Have you considered using `ifelse()` and the [normal R operators](https://www.statmethods.net/management/operators.html)?! Except for code conversion I do not see *any* reason to use `dplyr` here. It slows you down and in this instance creates problems where there shouldn't be any — 5th, Aug 09 '18 at 09:33
This is an issue that would tend to happen with shiny though (using text input as part of a command). The tag was relevant in my opinion. — moodymudskipper, Aug 10 '18 at 04:56

Lionel Henry · Accepted Answer · 2018-08-10T08:39:44.400

The way you are collecting the inputs is not very practical to work with. Your problem is that you are trying to parse code that looks like this:

var1, var2, var3

Try typing that in your R console and you'll get the same error:

#> Error: unexpected ',' in "var1,"
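
The same failure can be reproduced without shiny or dplyr. The following is a minimal sketch using only base R's parse(); the strings and the names msg and pieces are illustrative.

```r
# A comma-separated list is not a complete R expression, so parsing fails.
msg <- tryCatch(
  {
    parse(text = "var1, var2, var3")
    "parsed"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Each piece on its own parses without trouble.
pieces <- lapply(c("var1", "var2", "var3"), function(s) parse(text = s)[[1]])
print(pieces)
```

This is why the comma has to be dealt with before the text reaches the parser, either by collecting the pieces separately or by wrapping the whole string in a call.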

So first of all refactor your code so that you collect inputs as two vectors:

cnds <- c("condition1", "condition2", "condition3")
vals <- c("1", "2", "3")

Now you have two choices to turn these strings into R code: parsing or creating symbols. Use parsing when you expect arbitrary R code and symbols when you expect variable or column names. Can you spot the differences?

rlang::parse_exprs(c("foo", "bar()", "100"))
#> [[1]]
#> foo
#>
#> [[2]]
#> bar()
#>
#> [[3]]
#> [1] 100

rlang::syms(c("foo", "bar()", "100"))
#> [[1]]
#> foo
#>
#> [[2]]
#> `bar()`
#>
#> [[3]]
#> `100`

In your case you probably need parsing because the conditions will be R code. So let's start by parsing both vectors:

# map() comes from the purrr package
cnds <- purrr::map(cnds, rlang::parse_expr)
vals <- purrr::map(vals, rlang::parse_expr)

I'm mapping parse_expr() instead of using the plural version parse_exprs() because the latter can return a list that is longer than its input. For instance, parse_exprs(c("foo; bar", "baz; bam")) turns 2 strings into a list of 4 expressions. parse_expr() returns an error if a string contains more than one expression, so it is more robust in our case.
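
A small sketch of that difference, assuming the rlang package is installed; the names many and one are illustrative.

```r
library(rlang)

# parse_exprs() can return more expressions than there are input strings:
# each string here holds two expressions separated by a semicolon.
many <- parse_exprs(c("foo; bar", "baz; bam"))
print(length(many))

# parse_expr() refuses a string that holds more than one expression,
# so a malformed input is caught instead of silently shifting positions.
one <- tryCatch(parse_expr("foo; bar"), error = function(e) "error")
print(one)
```

With one parsed expression per input string, the list of conditions and the list of values stay the same length and can be paired up safely.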

Now we can map over these two lists of LHSs and RHSs and create the formulas. One simple way is to use quasiquotation to create the formulas by unquoting each LHS and its corresponding RHS:

# map2() comes from purrr; the argument names avoid shadowing base::c()
fs <- purrr::map2(cnds, vals, function(cnd, val) rlang::expr(!!cnd ~ !!val))

The result is a list of formula expressions that is ready to be spliced into case_when():

data %>% mutate(result = case_when(!!!fs))

Use rlang::qq_show() to see exactly what the splice-unquoting is doing:

rlang::qq_show(mutate(result = case_when(!!!fs)))
#> mutate(result = case_when(condition1 ~ 1, condition2 ~ 2, condition3 ~ 3))
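
Putting the steps above together, here is a minimal end-to-end sketch. It assumes the built-in mtcars data and made-up conditions on its gear column standing in for the user's input; the argument names cnd and val are only illustrative.

```r
library(dplyr)
library(purrr)

# Hypothetical inputs, e.g. collected from two separate text fields.
cnds <- c("gear == 3", "gear == 4", "TRUE")
vals <- c("0", "1", "2")

# Parse each string into exactly one R expression.
cnds <- map(cnds, rlang::parse_expr)
vals <- map(vals, rlang::parse_expr)

# Pair each condition with its value as a two-sided formula expression.
fs <- map2(cnds, vals, function(cnd, val) rlang::expr(!!cnd ~ !!val))

# Splice the formulas into case_when() inside mutate().
result <- mtcars %>% mutate(FLAG = case_when(!!!fs))
print(head(result[, c("gear", "FLAG")]))
```

Because the conditions are parsed rather than turned into symbols, they can be arbitrary expressions over the columns of the data.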

score 4 · Answer 2 · answered Aug 09 '18 at 10:09

Borrowing @phiver's example you could do:

library(dplyr)
conditions <- "gear == 3 ~ 0, gear == 4 ~ 1, TRUE ~ 2"
# wrap the string in a case_when() call, then parse and evaluate it
mtcars %>% group_by(vs) %>%
  mutate(FLAG = eval(parse(text = sprintf("case_when(%s)", conditions))))
# # A tibble: 32 x 12
# # Groups:   vs [2]
#      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb  FLAG
#    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#  1  21.0     6 160.0   110  3.90 2.620 16.46     0     1     4     4     1
#  2  21.0     6 160.0   110  3.90 2.875 17.02     0     1     4     4     1
#  3  22.8     4 108.0    93  3.85 2.320 18.61     1     1     4     1     1
#  4  21.4     6 258.0   110  3.08 3.215 19.44     1     0     3     1     0
#  5  18.7     8 360.0   175  3.15 3.440 17.02     0     0     3     2     0
#  6  18.1     6 225.0   105  2.76 3.460 20.22     1     0     3     1     0
#  7  14.3     8 360.0   245  3.21 3.570 15.84     0     0     3     4     0
#  8  24.4     4 146.7    62  3.69 3.190 20.00     1     0     4     2     1
#  9  22.8     4 140.8    95  3.92 3.150 22.90     1     0     4     2     1
# 10  19.2     6 167.6   123  3.92 3.440 18.30     1     0     4     4     1

The idea here is that your string cannot be evaluated on its own because it is not valid syntax by itself. So we first build a proper call around it (here using sprintf) and then evaluate it on the fly, which means it is evaluated in the right environment without further tricks needed.
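
To see what wrapping the string in a call achieves, the sketch below parses the wrapped text with base R only and inspects the result without evaluating it. The names wrapped and call_args are illustrative.

```r
conditions <- "gear == 3 ~ 0, gear == 4 ~ 1, TRUE ~ 2"

# Inside case_when(...) the commas are ordinary argument separators,
# so the text now parses into a single call.
wrapped <- parse(text = sprintf("case_when(%s)", conditions))[[1]]

# Drop the function name to get the arguments: one formula per condition.
call_args <- as.list(wrapped)[-1]
print(length(call_args))
print(call_args)
```

Nothing is evaluated at this point; eval() inside mutate() is what finally looks up gear in the data.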

Thanks! A different useful perspective I haven't thought of. — InterruptedException, Aug 12 '18 at 09:48

score 1 · Answer 3 · answered Aug 09 '18 at 09:41

You need to put every condition in a list and use quosures and quasiquotation (!!!) to get it to work. I will use mtcars as an example, following your code example.

library(dplyr)
# create list of quosures
conditions <- list(quo(gear == 3 ~ 0),
                   quo(gear == 4 ~ 1),
                   quo(TRUE ~ 2))


mtcars %>% group_by(vs) %>% 
  mutate(FLAG = case_when(!!! conditions)) # quasiquotation using !!! to splice the list
# A tibble: 32 x 12
# Groups:   vs [2]
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb  FLAG
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4     1
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4     1
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1     1
 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1     0
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2     0
 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1     0
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4     0
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2     1
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2     1
10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4     1
# ... with 22 more rows

As I understand, OP must really start from a string as it's user input in a field of his shiny app — moodymudskipper, Aug 09 '18 at 09:45
