I've been studying tidyeval semantics from a number of sources, but I'm getting a result I can't explain.

I'm using mutate_at and case_when to transform some variables by (1) retrieving their names using quotation, (2) modifying their names using gsub, and (3) referencing the data associated with the modified names.

In my minimal example, I'm creating foo$c as a transformation of foo$b which is meant to simply take on the value from foo$a. Steps (1) and (2) seem to be straightforward:

library(tidyverse)
library(rlang)

foo <- data.frame(a = 1, b = 2)

foo %>%
  mutate_at(vars(c = b),
            funs(case_when(
              TRUE ~ gsub("b", "a", as_name(quo(.)))
            )))
#>   a b c
#> 1 1 2 a

foo$c contains the correct name of the variable we want to look at. I understand I need to transform the string into a symbol using sym() and then evaluate it. If I were using a simple mutate(), !! and sym() would work fine:

foo %>%
  mutate(c := !!sym(gsub("b", "a", as_name(quo(b)))))
#>   a b c
#> 1 1 2 1

But when I do this inside the mutate_at(case_when()) I do not get the correct result:

foo %>%
  mutate_at(vars(c = b),
            funs(case_when(
              TRUE ~ !!sym(gsub("b", "a", as_name(quo(.))))
            )))
#>   a b c
#> 1 1 2 2

To see what's going on I made a simple printing function. Without !! it looks from the printout as though gsub() and sym() are both producing the intended results:

look <- function(x) {
  print(x)
  print(typeof(x))
  return(x)
}

foo %>%
  mutate_at(vars(c = b),
            funs(case_when(
              TRUE ~ look(sym(look(gsub("b", "a", as_name(quo(.))))))
            )))
#> [1] "a"
#> [1] "character"
#> a
#> [1] "symbol"
#> Error in mutate_impl(.data, dots): Evaluation error: object of type 'symbol' is not subsettable.

Once I put !! in front, the printout seems to show that we're getting a different result for gsub() and sym():

foo %>%
  mutate_at(vars(c = b),
            funs(case_when(
              TRUE ~ !!(look(sym(look(gsub("b", "a", as_name(quo(.)))))))
            )))
#> [1] "."
#> [1] "character"
#> .
#> [1] "symbol"
#> [1] "."
#> [1] "character"
#> .
#> [1] "symbol"
#>   a b c
#> 1 1 2 2

I don't understand how adding !! can change the result from the nested sym(gsub()). Adding a new operation to the end shouldn't change the prior/interior result. I've read that !! is "not a function call, but a syntactic operation" but I don't fully appreciate that distinction or how that could change the result.

Using eval_tidy instead of !! seems to work fine, though I can't explain why:

foo %>%
  mutate_at(vars(c = b),
            funs(case_when(
              TRUE ~ eval_tidy(look(sym(look(gsub("b", "a", as_name(quo(.)))))))
            )))
#> [1] "a"
#> [1] "character"
#> a
#> [1] "symbol"
#>   a b c
#> 1 1 2 1
lost

1 Answer

Just a few comments:

(a) Scoped verbs currently work with a weird substitution of the . pronoun. We are moving towards using functions (or lists of functions) that should work in a more principled way. I suggest using functions or purrr lambdas instead of funs(); this should remove some of the weirdness.

(b) The !! operator is all about timing. In your case it's funs() that processes it, in its immediate context, before the substitution happens.

(c) When using the scoped variants, it's best to forget about tidy eval and think in terms of mapping functions. This means you don't have access to the name of the column that is currently being mapped. We might add this as a feature in the future, but for now it's best to avoid working around this.
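
Point (b) can be checked outside of dplyr. The sketch below uses rlang's quo() as a stand-in for funs(); that substitution is an assumption made for illustration, since funs() has its own capture machinery. It suggests that !! is evaluated while the expression is being captured, when . is still the literal symbol . rather than a column:

```r
library(rlang)

# Without !!: the call is stored unevaluated, so a later substitution of `.`
# can still rewrite it before it runs.
delayed <- quo(sym(gsub("b", "a", as_name(quo(.)))))
is_call(quo_get_expr(delayed))

# With !!: the operand is evaluated while quo() captures the expression.
# At that moment as_name(quo(.)) is ".", gsub() has nothing to replace,
# and the bare symbol `.` is spliced into the captured expression.
eager <- quo(!!sym(gsub("b", "a", as_name(quo(.)))))
identical(quo_get_expr(eager), quote(.))
```

If funs() behaves the same way, the captured expression is just . by the time the substitution happens, which would explain why c ends up as a copy of b in the question.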

See also https://github.com/tidyverse/dplyr/issues/4199 for a recent related discussion.

Lionel Henry
  • Thanks. If scoped verbs aren't intended for this usage, do you have a suggestion for an alternative way to perform this mutation in the `tidyverse` framework? (i.e., mutate a set of columns using values from another set of columns whose names are a consistent variation of the names of the first set of columns) – lost Feb 26 '19 at 04:08
  • I can't think of something off the top of my head, but two future features might help: an `imap()` variant that maps the column with its name, then you can subset the `.data` pronoun. Also it feels like dataframe-column support should help with this kind of thing in a more principled way. – Lionel Henry Feb 26 '19 at 12:10
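
One possible workaround, not proposed in the thread, sidesteps the scoped verbs entirely: compute the source column names up front, convert them to symbols, and splice them into a plain mutate() with !!!. This extends the mutate(c := !!sym(...)) call that already works in the question. The names targets, new_names, sources and exprs are hypothetical and chosen only for illustration:

```r
library(dplyr)
library(rlang)

foo <- data.frame(a = 1, b = 2)

targets   <- c("b")                   # columns whose names drive the lookup
new_names <- c("c")                   # names of the columns to create
sources   <- gsub("b", "a", targets)  # derived source column names: "a"

# Named list of symbols, here list(c = a)
exprs <- set_names(syms(sources), new_names)

# !!! splices the list, so this is equivalent to mutate(foo, c = a)
out <- foo %>% mutate(!!!exprs)
out
#>   a b c
#> 1 1 2 1
```

Because the names are resolved before mutate() is called, nothing depends on knowing which column is currently being mapped. Wrapping each symbol in a larger expression (for example a case_when() call) would need expr() or a similar constructor around it.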