I'm trying to write a function that passes one or more arguments as variables to dplyr functions, and I'd like to understand how to do this in general. "Programming with dplyr" doesn't seem to cover the issue, and some of the more authoritative documentation that you can find with Google or in vignettes (e.g. ?`!!!`) seems to be outdated. Below is an example of what I'm trying to get and where I fail:

library(dplyr)

df <- as_tibble(mtcars)


df %>% group_by(cyl, gear) %>% summarise(mpg = mean(mpg)) #This is the result I want from a function


testfunc <- function(variables) {
  df %>% group_by({{variables}}) %>% summarise(mpg = mean(mpg))
}

testfunc(cyl)                                             #It works for a single variable

testfunc(cyl, gear)                                       #It fails for multiple variables

testfunc2 <- function(variables) {                        #Second attempt
  variables <- enquos(variables)
  df %>% group_by(!!variables) %>% summarise(mpg = mean(mpg))
}

testfunc2(cyl, gear)                                       #Unused argument error
testfunc2(c(cyl, gear))                                    #Doesn't work
testfunc2(c("cyl", "gear"))                                #Doesn't work

How do you solve this problem in a general way?

Thanks!!

s_a
  • The link you provided does cover this. Use `testfunc <- function(...) {df %>% group_by(...) %>% summarise(mpg = mean(mpg))}`. But you need to decide how you are passing in the variables: you can't expect character values and symbols to work the same way, and you can't pass character values to `group_by` directly. – MrFlick Aug 23 '21 at 00:55
  • Thanks, you're right. I missed it, and that was the solution. – s_a Aug 23 '21 at 14:34
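
For the character-vector case raised in the comments, one commonly used pattern is to wrap the names in `all_of()` inside `across()`. The following is only a sketch, assuming dplyr 1.0.0 or later; the helper name `group_mean_chr` and its arguments are hypothetical and not from the thread.

```r
library(dplyr)

# Hypothetical helper: `vars` is a character vector of column names.
# all_of() turns the strings into a column selection, and across()
# lets group_by() consume that selection.
group_mean_chr <- function(data, vars) {
  data %>%
    group_by(across(all_of(vars))) %>%
    summarise(mpg = mean(mpg), .groups = "drop")
}

res_chr <- group_mean_chr(mtcars, c("cyl", "gear"))
```

Here `res_chr` should have one row per `cyl`/`gear` combination, like the target output in the question, except that `.groups = "drop"` returns it ungrouped.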

2 Answers

You can define an argument for the data frame and use `...` for the variables to group by:

library(dplyr)

testfunc <- function(df, ...) {
  df %>%
    group_by(...) %>%              # grouping columns are forwarded unchanged
    summarise(mpg = mean(mpg))
}
testfunc(mtcars, cyl, gear)
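
As a quick check of the `...` approach, the sketch below compares the wrapper against the direct `group_by()` call and shows that the same function handles one or several grouping columns. The name `group_mean_dots` and the use of `.groups = "drop"` (to silence the regrouping message) are illustrative choices, not part of the answer above.

```r
library(dplyr)

# Hypothetical variant of the wrapper: everything after `data`
# is forwarded as-is to group_by().
group_mean_dots <- function(data, ...) {
  data %>%
    group_by(...) %>%
    summarise(mpg = mean(mpg), .groups = "drop")
}

by_one <- group_mean_dots(mtcars, cyl)        # a single grouping column
by_two <- group_mean_dots(mtcars, cyl, gear)  # several grouping columns

# The direct call the wrapper is meant to reproduce.
direct <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(mpg = mean(mpg), .groups = "drop")

all.equal(by_two, direct)
```

One design note: because `...` is passed straight through, the caller writes bare column names, not strings.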
Vinícius Félix

This is covered in detail in "Advanced R" by Hadley Wickham (https://adv-r.hadley.nz/ / https://amzn.to/2WoabjB), e.g. Chapters 6, 19 and 20:

library(tidyverse)
df <- as_tibble(mtcars)
df %>% group_by(cyl, gear) %>% summarise(mpg = mean(mpg))
#> `summarise()` has grouped output by 'cyl'. You can override using the `.groups` argument.
#> # A tibble: 8 x 3
#> # Groups:   cyl [3]
#>     cyl  gear   mpg
#>   <dbl> <dbl> <dbl>
#> 1     4     3  21.5
#> 2     4     4  26.9
#> 3     4     5  28.2
#> 4     6     3  19.8
#> 5     6     4  19.8
#> 6     6     5  19.7
#> 7     8     3  15.0
#> 8     8     5  15.4

testfunc <- function(variables) {
  df %>% group_by({{variables}}) %>% summarise(mpg = mean(mpg))
}

testfunc(cyl)
#> # A tibble: 3 x 2
#>     cyl   mpg
#>   <dbl> <dbl>
#> 1     4  26.7
#> 2     6  19.7
#> 3     8  15.1
testfunc(cyl, gear)
#> Error in testfunc(cyl, gear): unused argument (gear)

# https://adv-r.hadley.nz/functions.html?q=...#fun-dot-dot-dot
testfunc2 <- function(variables, ...) {                        #Second attempt
  variables <- enquos(variables, ...)
  df %>% group_by(!!!variables) %>% summarise(mpg = mean(mpg))
}

testfunc2(cyl, gear)
#> `summarise()` has grouped output by 'cyl'. You can override using the `.groups` argument.
#> # A tibble: 8 x 3
#> # Groups:   cyl [3]
#>     cyl  gear   mpg
#>   <dbl> <dbl> <dbl>
#> 1     4     3  21.5
#> 2     4     4  26.9
#> 3     4     5  28.2
#> 4     6     3  19.8
#> 5     6     4  19.8
#> 6     6     5  19.7
#> 7     8     3  15.0
#> 8     8     5  15.4

Created on 2021-08-23 by the reprex package (v2.0.0)
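
If you would rather keep a single named argument instead of `...`, a possible alternative is to embrace it inside `across()`, which accepts a tidyselect expression such as `c(cyl, gear)`. This is a sketch, assuming dplyr 1.0.0 or later; `group_mean_sel` is a hypothetical name and this is not the code from the reprex above.

```r
library(dplyr)

# Hypothetical helper: `variables` is a tidyselect expression,
# e.g. a bare column name or c(col1, col2).
group_mean_sel <- function(data, variables) {
  data %>%
    group_by(across({{ variables }})) %>%
    summarise(mpg = mean(mpg), .groups = "drop")
}

res_one <- group_mean_sel(mtcars, cyl)           # one column
res_two <- group_mean_sel(mtcars, c(cyl, gear))  # several columns
```

With this form the call `group_mean_sel(mtcars, c(cyl, gear))` corresponds to the `testfunc2(c(cyl, gear))` attempt in the question.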

jared_mamrot