How to use tidy evaluation with column names as strings?

Question

I've read most of the documentation about tidy evaluation and programming with dplyr but cannot get my head around this (simple) problem.

I want to program with dplyr and pass column names as strings to the function.

library(dplyr)

df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a = sample(5),
  b = sample(5)
)

my_summarise <- function(df, group_var) {
  df %>%
    group_by(group_var) %>%
    summarise(a = mean(a))
}

my_summarise(df, 'g1')

This gives me the error: Column 'group_var' is unknown.

What must I change inside the my_summarise function in order to make this work?

score 4 · Answer 1 · answered Apr 21 '20 at 16:47

Convert the string column name to a symbol (a bare column name) using as.name(), then inject it with the new {{ }} (read "curly-curly") operator, as below:

library(dplyr)

df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a = sample(5),
  b = sample(5)
)

my_summarise <- function(df, group_var) {

  grp_var <- as.name(group_var)

  df %>%
    group_by({{grp_var}}) %>%
    summarise(a = mean(a))
}

my_summarise(df, 'g1')

Vishal Katti

Will this function then still work if I input the variable directly and not as a string? Or is a type catching needed in this function? – Fnguyen Apr 21 '20 at 17:11
If you are going to give the column name directly without quotes, then you don't need the line `grp_var <- as.name(group_var)`. Your new function would be as follows: `my_summarise <- function(df, group_var) { df %>% group_by({{group_var}}) %>% summarise(a = mean(a)) }` – Vishal Katti Apr 21 '20 at 17:25
Yes, but suppose I want to "proof" this function against both types of arguments (string and literal): would `as.name` still work, or would I need to implement a type check at the beginning of the function? This is purely out of curiosity, as I am not OP. – Fnguyen Apr 21 '20 at 17:50
You can use `ensym` to cover both cases. You will need to use the `!!` (bang-bang) operator in combination with `ensym` from the rlang package. – Vishal Katti Apr 22 '20 at 10:06

score 3 · Accepted Answer · answered Apr 21 '20 at 19:09

We can also use ensym with !!

my_summarise <- function(df, group_var) {
  df %>%
    group_by(!!rlang::ensym(group_var)) %>%
    summarise(a = mean(a))
}

my_summarise(df, 'g1')

Another option is group_by_at:

my_summarise <- function(df, group_var) {
  df %>%
    group_by_at(vars(group_var)) %>%
    summarise(a = mean(a))
}

my_summarise(df, 'g1')
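
As a side note, the scoped verbs such as group_by_at were superseded in dplyr 1.0.0 in favour of across(). The following is a minimal sketch of the same idea with across() and all_of(), assuming dplyr 1.0.0 or later; the fixed values in column a (instead of sample(5)) and the `.groups = "drop"` argument are choices made here for a reproducible illustration, not part of the original answer.

```r
library(dplyr)

# Fixed values instead of sample(5) so the result is reproducible.
df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a  = c(1, 2, 3, 4, 5)
)

my_summarise <- function(df, group_var) {
  df %>%
    # all_of() selects the columns named in the character vector and
    # errors if any of them is missing from df.
    group_by(across(all_of(group_var))) %>%
    summarise(a = mean(a), .groups = "drop")
}

res_one <- my_summarise(df, "g1")
res_two <- my_summarise(df, c("g1", "g2"))
```

Because all_of() takes a character vector, the same function also handles several grouping columns at once, as in the second call.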

akrun

All the answers are good. I prefer to use `ensym` because it accepts arguments with and without quotes. Thanks! – Стив Риедо Apr 22 '20 at 08:59
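
The point made in the comments, that ensym handles both quoted and unquoted arguments, can be checked with a small sketch. It reuses the accepted answer's function; the fixed values in column a are an assumption made here so that both calls can be compared.

```r
library(dplyr)

# Fixed values instead of sample(5) so both calls return the same numbers.
df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  a  = c(1, 2, 3, 4, 5)
)

my_summarise <- function(df, group_var) {
  df %>%
    # ensym() captures what the caller typed for group_var, whether a
    # string or a bare name, as a symbol; !! injects it into group_by().
    group_by(!!rlang::ensym(group_var)) %>%
    summarise(a = mean(a))
}

res_string <- my_summarise(df, "g1")  # column name as a string
res_bare   <- my_summarise(df, g1)    # column name as a bare name
```

Both calls should group by g1 and return the same summary.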

score 1 · Answer 3 · answered Apr 22 '20 at 01:32

You can also use sym and !!

my_summarise <- function(df, group_var) {
  df %>%
    group_by(!!sym(group_var)) %>%
    summarise(a = mean(a))
}

my_summarise(df, 'g1')

# A tibble: 2 x 2
     g1     a
  <dbl> <dbl>
1     1  3.5 
2     2  2.67
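
One practical difference from ensym is worth a sketch: sym() receives the evaluated value of group_var, so the column name can also come from a variable that holds a string. By contrast, ensym() captures the argument expression itself, so it would treat the variable's name as the column name. The variable `col` and the fixed values in column a are hypothetical, chosen here for illustration; sym is called as rlang::sym to make its origin explicit.

```r
library(dplyr)

# Fixed values instead of sample(5) so the result is reproducible.
df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  a  = c(1, 2, 3, 4, 5)
)

my_summarise <- function(df, group_var) {
  df %>%
    # sym() turns the string stored in group_var into a symbol,
    # and !! injects that symbol into group_by().
    group_by(!!rlang::sym(group_var)) %>%
    summarise(a = mean(a))
}

col <- "g1"                    # hypothetical variable holding the column name
res <- my_summarise(df, col)   # groups by g1, not by a column called "col"
```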

Kay
