Based on the section on capturing multiple arguments in Programming with dplyr, I am trying to specify multiple variables to group by in dplyr::group_by without relying on ... but using an explicit list argument group_vars instead, and without needing to quote the list elements in group_vars.
Example data
library(dplyr)  # attaches %>%, which the functions below rely on

df <- tibble::tribble(
  ~a,  ~b,  ~c,
  "A", "a", 10,
  "A", "a", 20,
  "A", "b", 1000,
  "B", "a", 5,
  "B", "b", 1
)
Approach based on ... from Programming with dplyr:
# Approach 1 -----
my_summarise <- function(df, ...) {
  group_vars <- dplyr::enquos(...)
  df %>%
    dplyr::group_by(!!!group_vars) %>%
    dplyr::summarise(x = mean(c))
}
my_summarise(df, a, b)
#> # A tibble: 4 x 3
#> # Groups:   a [2]
#>   a     b         x
#>   <chr> <chr> <dbl>
#> 1 A     a        15
#> 2 A     b      1000
#> 3 B     a         5
#> 4 B     b         1
Approach based on list argument with quoted elements:
# Approach 2 -----
my_summarise_2 <- function(df, group_vars = c("a", "b")) {
  group_vars <- dplyr::syms(group_vars)
  df %>%
    dplyr::group_by(!!!group_vars) %>%
    dplyr::summarise(x = mean(c))
}
my_summarise_2(df)
#> # A tibble: 4 x 3
#> # Groups:   a [2]
#>   a     b         x
#>   <chr> <chr> <dbl>
#> 1 A     a        15
#> 2 A     b      1000
#> 3 B     a         5
#> 4 B     b         1
my_summarise_2(df, group_vars = "a")
#> # A tibble: 2 x 2
#>   a         x
#>   <chr> <dbl>
#> 1 A      343.
#> 2 B        3
I can't find an approach that lets me supply unquoted column names:
# Approach 3 -----
my_summarise_3 <- function(df, group_vars = list(a, b)) {
  group_vars <- dplyr::enquos(group_vars)
  df %>%
    dplyr::group_by(!!!group_vars) %>%
    dplyr::summarise(x = mean(c))
}
my_summarise_3(df)
#> Error: Column `list(a, b)` must be length 5 (the number of rows) or one, not 2
I guess the crucial thing is to end up with the same list structure that group_vars <- dplyr::enquos(...) produces:
<list_of<quosure>>
[[1]]
<quosure>
expr: ^a
env: global
[[2]]
<quosure>
expr: ^b
env: global
I tried to tackle it with group_vars %>% purrr::map(dplyr::enquo), but of course R complains about a and b, since they get evaluated before they can be captured.
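One direction that might produce the list-of-quosures structure shown above is to have the caller wrap the columns in dplyr::vars(), which returns a list of quosures that can be spliced with !!!. This is only a minimal sketch: the name my_summarise_4 and the vars(a, b) default are assumptions made for illustration, and it changes the interface from list(a, b) to vars(a, b), so it may not be exactly what I am after.

```r
library(dplyr)

df <- tibble::tribble(
  ~a,  ~b,  ~c,
  "A", "a", 10,
  "A", "a", 20,
  "A", "b", 1000,
  "B", "a", 5,
  "B", "b", 1
)

# Hypothetical variant: group_vars is expected to be built with dplyr::vars(),
# which captures unquoted column names as a list of quosures.
my_summarise_4 <- function(df, group_vars = dplyr::vars(a, b)) {
  df %>%
    dplyr::group_by(!!!group_vars) %>%  # splice the quosures into group_by()
    dplyr::summarise(x = mean(c)) %>%
    dplyr::ungroup()                    # drop any remaining grouping
}

my_summarise_4(df)
my_summarise_4(df, group_vars = dplyr::vars(a))
```

The column names stay unquoted at the call site, at the cost of requiring vars() instead of list().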