
I'm modifying nested data frames inside of foo with map2 and mutate, and I'd like to name a variable in each nested data frame according to foo$name. I'm not sure what the proper syntax for NSE/tidy eval unquoting would be here. My attempt:

library(tidyverse)

foo <- mtcars %>%
  group_by(gear) %>%
  nest %>%
  mutate(name = c("one", "two", "three")) %>%
  mutate(data = map2(data, name, ~
                       mutate(.x, !!(.y) := "anything")))
#> Error in quos(...): object '.y' not found

I want the name of the newly created variable inside the nested data frames to be "one", "two", and "three", respectively.

I'm basing my attempt on the syntax I'd use for an ordinary mutate on an ordinary data frame, where name is a string:

name <- "test"
mtcars %>% mutate(!!name := "anything") # works fine

If successful, the following line should return TRUE:

foo[1,2] %>% unnest %>% names %>% .[11] == "one"
lost

2 Answers


This seems to be a feature/bug (not sure, see linked GitHub issue below) of how !! works within mutate and map. The solution is to define a custom function, in which case the unquoting works as expected.

library(tidyverse)

custom_mutate <- function(df, name, string = "anything")
    mutate(df, !!name := string)

foo <- mtcars %>%
  group_by(gear) %>%
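  nest() %>%
  ungroup() %>%  # tidyr >= 1.0 keeps the grouping after nest(); drop it so a length-3 name vector fits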
  mutate(name = c("one", "two", "three")) %>%
  mutate(data = map2(data, name, ~
      custom_mutate(.x, .y)))

foo[1,2] %>% unnest %>% names %>% .[11] == "one"
#[1] TRUE

You can find more details on GitHub under issue #541: map2() call in dplyr::mutate() error while standalone map2() call works. Note that the issue was closed in September 2018, so I am assuming this is intended behaviour.


An alternative might be to use group_split instead of nest, in which case we avoid the unquoting issue:

nms <- c("one", "two", "three")

mtcars %>%
    group_split(gear) %>%
    map2(nms, ~.x %>% mutate(!!.y := "anything"))
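If the group keys are needed alongside the split pieces, one possible approach is to fetch them with group_keys and bind them back to the list. This is only a sketch and not something the linked issue prescribes; the names by_gear, keys, pieces and result are made up for illustration.

```r
library(dplyr)
library(purrr)

nms <- c("one", "two", "three")
by_gear <- mtcars %>% group_by(gear)

# group_keys() returns one row per group, in the same order as group_split()
keys <- group_keys(by_gear)

# Same idea as above: no outer mutate() is involved, so !!.y is only
# processed when the inner mutate() is called by map2()
pieces <- by_gear %>%
  group_split() %>%
  map2(nms, ~ mutate(.x, !!.y := "anything"))

# Re-attach the keys and names to get a nested-style tibble back
# (illustrative; each piece still contains its own gear column)
result <- keys %>%
  mutate(name = nms, data = pieces)
```

Here result has one row per gear value, with the modified data frames stored in the data list column.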
Maurits Evers
  • I'm not entirely sure; I had a feeling that this had to do with `map`/`map2` and NSE, so I searched for something like "dynamic column name" in combination with `map2` and `mutate`; the link to the GitHub issue was among the first set of hits. – Maurits Evers Apr 10 '19 at 06:13
  • BTW I know that the code you give is just a toy example, but something like `mutate(name = c("one", "two", "three"))` can potentially be quite dangerous, if you end up with more (or less) than 3 groups after `nest`. – Maurits Evers Apr 10 '19 at 06:17
  • it will throw an error if your vector is not either length 1 or the same length as the dataframe. – lost Apr 10 '19 at 06:43
  • @lost Exactly, so that's not very robust. I guess you could wrap it in a `tryCatch` environment but it might be better to have a check somewhere to ensure that `nms <- c("one", "two", "three"); length(nms) == length(unique(mtcars$gear))`. – Maurits Evers Apr 10 '19 at 07:53
  • @lost PS: I've added an alternative approach using the new `dplyr::group_split` to split the data by group and then operate on the individual `list` elements with `map2`. – Maurits Evers Apr 10 '19 at 08:04
  • right, I wouldn't have a line like that in my actual code. Was just for creating the example. – lost Apr 10 '19 at 20:07
  • the `group_split` alternative has some issues; you end up with just a list of the nested data frames, and you lose the rest of `foo`. Maybe there's an easy fix for this. I don't have a lot of experience with `split`ing. – lost Apr 10 '19 at 20:25
  • @lost You're not losing any data. Instead of producing a `nest`ed `tibble` by `mtcars$gear` you have a `list` of `tibble`s split by `mtcars$gear`. Depending on your downstream data processing it's as easy to operate on the `list` as on the `nest`ed `tibble`. It boils down to personal preference. – Maurits Evers Apr 11 '19 at 03:15

This is because of the timing of unquoting. Nesting tidy eval functions can be a bit tricky because it is the very first tidy eval function that processes the unquoting operators.

Let's rewrite this:

mutate(data = map2(data, name, ~ mutate(.x, !!.y := "anything")))

to

mutate(data = map2(data, name, function(x, y) mutate(x, !!y := "anything")))

The x and y bindings are only created when the function is called by map2(). So when the first mutate() runs, these bindings don't exist yet and you get an object not found error. With the formula it's a bit harder to see but the formula expands to a function taking .x and .y arguments so we have the same problem.
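The timing can be seen in isolation with rlang::expr(), which processes !! the same way the capturing step of a tidy eval verb does. This is a small illustrative sketch; the variable y and its value "outer" are made up for the demonstration.

```r
library(rlang)

# A binding that exists *outside* the anonymous function
y <- "outer"

# expr() processes every !! it can see, including the one inside the
# function definition, before that function is ever created or called
captured <- expr(function(y) mutate(x, !!y := "anything"))

# The inner !!y has already been replaced by the outer value
print(captured)
```

Without the outer y, the same call fails with an object not found error, which is what happens to .y in the original code.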

In general, it's better to avoid complex nested logic in your code because it makes it harder to read. Tidy eval adds even more complexity, so it's best to do things in steps. As an added bonus, doing things in steps requires creating intermediate variables which, if well named, help readers understand what the function is doing.
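As a rough sketch of what doing it in steps could look like: the function is given a name and defined outside the outer mutate(), so !! is only processed when the inner mutate() runs. The names add_named_column, nested and result are hypothetical, and the ungroup() call is an assumption for tidyr versions where nest() keeps the grouping.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Step 1: a named helper. `name` is bound by the time the inner mutate()
# captures its arguments, so !!name can be unquoted.
add_named_column <- function(df, name, value = "anything") {
  mutate(df, !!name := value)
}

# Step 2: build the nested tibble
nested <- mtcars %>%
  group_by(gear) %>%
  nest() %>%
  ungroup() %>%  # assumption: drop grouping so a length-3 vector can be added
  mutate(name = c("one", "two", "three"))

# Step 3: apply the helper to each nested data frame
result <- nested %>%
  mutate(data = map2(data, name, add_named_column))
```

Because the outer mutate() only sees the symbol add_named_column, there is no !! for it to process too early.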

Lionel Henry
  • While I agree with your general advice to "do things in steps", I'd say that this is hardly a complex nested function; even more so, I'd speculate that this could be quite a common situation that one finds oneself in when working with `nest`ed data. I certainly have done (or been wanting to do) similar things. It might be worth pointing out that the issue can be avoided if we split the `data.frame`/`tibble` with `split` (or the new `dplyr::group_split`) first and then operate on the individual `list` elements, see the update to my post. – Maurits Evers Apr 10 '19 at 08:03
  • @lionel I agree with Maurits -- I don't quite see how this usage is much more complex than the "typical" `tidyeval` usage that's covered in R4DS, the vignettes, etc. It seems like a fairly natural way to use nesting and mapping. I'd be curious to see what steps you'd prefer for this operation. – lost Apr 10 '19 at 20:08
  • I think nested mutates are complex but I accept that others might not find it so. In any case, it makes tidy eval semantics trickier. @lost The step I recommend is to give a name to the anonymous function. – Lionel Henry Apr 11 '19 at 09:03
  • Maybe we can find a solution to make things more in line with intuition here. This keeps tripping up tidy eval users :/ – Lionel Henry Apr 11 '19 at 09:07
  • @LionelHenry it looks like this solution no longer works... how can I get this working with the new `{{}}` syntax? – lost Mar 09 '21 at 07:13
  • I didn't provide any solution here, I just showed why it fails. The solution is to pull the function out of the `mutate()` call. The next major version of rlang might fix it more generally though: https://github.com/r-lib/rlang/issues/845 – Lionel Henry Mar 09 '21 at 09:14