I have a large dataset and would like to apply transformations to some of its variables programmatically. To illustrate, say I want to take the log of the variables named in a character vector. I would like to keep the input variables and generate one new variable per input, adding a prefix (or suffix) to each name in the character vector. Since a few lines of code are worth a thousand paragraphs: I want the result shown in df_aim, but produced in a less repetitive fashion, for example with syntax like that of df_syntax.

reprex

library(tidyverse)
data(mtcars)

vars_to_transf <- c("disp", "hp", "drat")

# these results 
df_aim <- mtcars %>% 
    mutate(
        ln_disp =  log(disp), 
        ln_hp   =  log(hp),
        ln_drat =  log(drat)
    )

# with something like this syntax 
df_syntax <- mtcars %>% 
    mutate(across(all_of(vars_to_transf), .fns = log))

> head(df_aim)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb ln_disp ln_hp ln_drat
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   5.075 4.700   1.361
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   5.075 4.700   1.361
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   4.682 4.533   1.348
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   5.553 4.700   1.125
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   5.886 5.165   1.147
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   5.416 4.654   1.015
> head(df_syntax)
                   mpg cyl  disp    hp  drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6 5.075 4.700 1.361 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6 5.075 4.700 1.361 2.875 17.02  0  1    4    4
Datsun 710        22.8   4 4.682 4.533 1.348 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6 5.553 4.700 1.125 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8 5.886 5.165 1.147 3.440 17.02  0  0    3    2
Valiant           18.1   6 5.416 4.654 1.015 3.460 20.22  1  0    3    1

I appreciate your attention and apologise should this question be a duplicate.

Marcelo Avila

2 Answers

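You can pass a named list of functions. The list names become suffixes on the new columns (disp_log, hp_log, drat_log), and the original columns are kept:

mtcars %>% 
    mutate(across(all_of(vars_to_transf), list(log = log)))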

If you want to apply more than one function, a named list combined with the .names argument works:

mtcars %>% 
    mutate(across(all_of(vars_to_transf), 
                  list(log = log, sqrt = sqrt), 
                  .names = "{.col}_{.fn}"))
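
To see which columns the two calls above actually add, here is a small self-contained check. It is only a sketch and assumes dplyr 1.0.0 or later (where across() was introduced); the object names one_fn and two_fns are made up for illustration. The selection is wrapped in all_of() because vars_to_transf is an external character vector.

```r
library(dplyr)

vars_to_transf <- c("disp", "hp", "drat")

# Single function in a named list: each new column gets the suffix "_log"
one_fn <- mtcars %>%
    mutate(across(all_of(vars_to_transf), list(log = log)))

# Two functions with an explicit naming template: one new column
# per (column, function) pair
two_fns <- mtcars %>%
    mutate(across(all_of(vars_to_transf),
                  list(log = log, sqrt = sqrt),
                  .names = "{.col}_{.fn}"))

# Names of the columns added on top of the original mtcars columns
setdiff(names(one_fn), names(mtcars))
setdiff(names(two_fns), names(mtcars))
```

The first setdiff() should list disp_log, hp_log and drat_log; the second should list six names, one for each combination of column and function.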
bouncyball

The answer is right there in the help file, ?dplyr::across. The .names argument takes care of it:

.names

A glue specification that describes how to name the output columns. This can use {.col} to stand for the selected column name, and {.fn} to stand for the name of the function being applied. The default (NULL) is equivalent to "{.col}" for the single function case and "{.col}_{.fn}" for the case where a list is used for .fns.

mtcars %>% mutate(
    across(all_of(vars_to_transf), .fns = log, .names = "ln_{.col}")
)
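
As a sanity check, the prefix approach can be compared against the hand-written df_aim from the question. This is a minimal sketch assuming dplyr 1.0.0 or later; df_across is a hypothetical name, and the template uses the {.col} placeholder described in the help text quoted above.

```r
library(dplyr)

vars_to_transf <- c("disp", "hp", "drat")

# Explicit version from the question, used as the reference result
df_aim <- mtcars %>%
    mutate(
        ln_disp = log(disp),
        ln_hp   = log(hp),
        ln_drat = log(drat)
    )

# Programmatic version: "{.col}" is replaced by each selected column name,
# so the new columns are named ln_disp, ln_hp and ln_drat
df_across <- mtcars %>%
    mutate(across(all_of(vars_to_transf), .fns = log, .names = "ln_{.col}"))

# Compare the two data frames (values, column names and column order)
all.equal(df_aim, df_across)
```

Using {.col} rather than the name of the external vector ties each new name to the column being transformed, so the naming does not depend on the order of vars_to_transf. For a suffix instead of a prefix, a template such as "{.col}_ln" should work the same way.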
Marcelo Avila