What is best practice for using variable names as function arguments in the right-hand side of dplyr verbs such as mutate?
Applying a function to a variable in a tibble with mutate is trivial if we use the column name directly:
library(dplyr)

mtcars %>%
  select(mpg) %>%
  mutate(mpg = as.character(mpg)) %>%
  glimpse()
# Observations: 32
# Variables: 1
# $ mpg <chr> "21", "21", "22.8", "21.4", "18.7", "18.1", "14.3", "24.4", "22.8", "19.2", "17.8", "16.4", "17.3", "15.2", ...
What if we don't know the column name, but it's stored in a variable as a string? As a simple example, let's store the string "mpg" in var_of_interest, and then change all the values to 5. This works when the right-hand side of the mutate expression is a simple value, if we unquote the name on the left-hand side with !! and use := in place of =:
var_of_interest <- colnames(mtcars)[1]
glimpse(var_of_interest)
# chr "mpg"
mtcars %>%
  select(var_of_interest) %>%
  mutate(!! var_of_interest := 5) %>%
  glimpse()
# Observations: 32
# Variables: 1
# $ mpg <dbl> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5
How should we use var_of_interest inside a function call on the right-hand side of mutate, for example inside as.character? The following works, by unquoting as.name(var_of_interest). (Thanks to this SO post and answer, which use as.symbol.)
mtcars %>%
  select(var_of_interest) %>%
  mutate(!! var_of_interest := as.character(!! as.name(var_of_interest))) %>%
  glimpse()
# Observations: 32
# Variables: 1
# $ mpg <chr> "21", "21", "22.8", "21.4", "18.7", "18.1", "14.3", "24.4", "22.8", "19.2", "17.8", "16.4", "17.3", "15.2", ...
The following, using rlang::sym, also works (thanks to this SO post and answer):
mtcars %>%
  select(var_of_interest) %>%
  mutate(!! var_of_interest := as.character(!! rlang::sym(var_of_interest))) %>%
  glimpse()
# Observations: 32
# Variables: 1
# $ mpg <chr> "21", "21", "22.8", "21.4", "18.7", "18.1", "14.3", "24.4", "22.8", "19.2", "17.8", "16.4", "17.3", "15.2", ...
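As a quick sanity check on the two approaches above, the sketch below compares what as.name and rlang::sym return for the same string. It assumes the rlang package is installed; the object names (sym_base, sym_rlang, first_only, strict) are only illustrative. As far as I can tell, both produce the same symbol for a single string, and the one difference I can see is how they treat a character vector of length greater than one.

```r
var_of_interest <- "mpg"

# Base R and rlang versions of "string -> symbol"
sym_base  <- as.name(var_of_interest)
sym_rlang <- rlang::sym(var_of_interest)

# Both should be symbols, and the same symbol
is.symbol(sym_base)
identical(sym_base, sym_rlang)

# With more than one string, as.name() silently keeps only the first element,
# whereas rlang::sym() is expected to signal an error instead
first_only <- as.name(c("mpg", "cyl"))
strict     <- try(rlang::sym(c("mpg", "cyl")), silent = TRUE)

first_only
inherits(strict, "try-error")
```

If that holds, the choice between the two for a single column name looks mostly like a matter of how strictly the input should be validated.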
Are there any drawbacks to either of these methods? Should we be using quo or enquo instead? The latter are discussed in this vignette, but I'm struggling to understand their usage and I'd love to hear more about this.
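To make the question concrete, here is a rough sketch of the two styles as I currently understand them, which may well be incomplete. It assumes dplyr and rlang are installed. The helper names to_chr_string and to_chr_bare are hypothetical and not part of either package. The first takes the column name as a string and converts it with rlang::sym; the second takes a bare column name and captures it with enquo.

```r
library(dplyr)

# Hypothetical helper: the column is supplied as a string, e.g. "mpg"
to_chr_string <- function(df, col_name) {
  col_sym <- rlang::sym(col_name)            # string -> symbol
  df %>% mutate(!! col_name := as.character(!! col_sym))
}

# Hypothetical helper: the column is supplied as a bare name, e.g. mpg
to_chr_bare <- function(df, col) {
  col <- enquo(col)                          # capture the caller's expression
  col_name <- rlang::as_name(col)            # quosure -> string for the left-hand side
  df %>% mutate(!! col_name := as.character(!! col))
}

from_string <- to_chr_string(mtcars, "mpg")
from_bare   <- to_chr_bare(mtcars, mpg)

# Both calls should yield a character mpg column with the same values
is.character(from_string$mpg)
identical(from_string$mpg, from_bare$mpg)
```

On this reading, enquo seems aimed at functions whose callers type the column name unquoted, while sym (or as.name) fits the case in this question, where the name already exists as a string.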