I have a data frame with several continuous variables, and I need to compute the logarithm of each one. Each new variable must be named after the original variable with _ln appended. I would like to write a loop for this task. The code for a single variable was:

dat <- data.frame(fev1 = c(3.4, 5.8, 3.1, NA),
                  fvc = c(3.9, 6.2, 4.5, 6.0),
                  rat = c(0.8, 0.91, 0.9, NA),
                  sex = c(0, 1, 0, 1))

var <- sym('fev1')

dat <- 
  dat %>% 
  mutate('{{vari}}_ln':= log({{vari}}))

names(dat)

[1] "fev1"    "fvc"     "rat"     "sex"     "fev1_ln"

I used the following code to make a loop:

vars <- c('fev1', 'fvc', 'rat')

purrr::map(dat[vars], ~.x %>% 
             vs <- sym(~.x) %>% 
           mutate('{{vs}}_ln':= log({{vs}})))

However, it did not work. The error message was:

Error in sym(): ! Can't convert a <formula> object to a symbol.

What I want to get is a data frame with the original variables plus the new variables.

dat

  fev1 fvc  rat sex  fev1_ln   fvc_ln      rat_ln
1  3.4 3.9 0.80   0 1.223775 1.360977 -0.22314355
2  5.8 6.2 0.91   1 1.757858 1.824549 -0.09431068
3  3.1 4.5 0.90   0 1.131402 1.504077 -0.10536052
4   NA 6.0   NA   1       NA 1.791759          NA

How could I make a loop for this task?

Thanks

DavidMB
  • it looks like you didn't update your variable names when posting the question (vari vs var, vars vs vs) – Mark Aug 25 '23 at 18:20
  • what should the output be? If you could edit your question to add the desired output that would be great! :-) – Mark Aug 25 '23 at 18:23

2 Answers

dat %>%
  mutate(across(all_of(vars), log, .names = "{col}_ln"), .keep = 'none')

   fev1_ln   fvc_ln      rat_ln
1 1.223775 1.360977 -0.22314355
2 1.757858 1.824549 -0.09431068
3 1.131402 1.504077 -0.10536052
4       NA 1.791759          NA

dat %>%
  mutate(across(all_of(vars), log, .names = "{col}_ln"), .keep = 'unused')

  sex  fev1_ln   fvc_ln      rat_ln
1   0 1.223775 1.360977 -0.22314355
2   1 1.757858 1.824549 -0.09431068
3   0 1.131402 1.504077 -0.10536052
4   1       NA 1.791759          NA

dat %>%
  mutate(across(all_of(vars), log, .names = "{col}_ln"), .keep = 'used')

  fev1 fvc  rat  fev1_ln   fvc_ln      rat_ln
1  3.4 3.9 0.80 1.223775 1.360977 -0.22314355
2  5.8 6.2 0.91 1.757858 1.824549 -0.09431068
3  3.1 4.5 0.90 1.131402 1.504077 -0.10536052
4   NA 6.0   NA       NA 1.791759          NA

dat %>%
  mutate(across(all_of(vars), log, .names = "{col}_ln"))

  fev1 fvc  rat sex  fev1_ln   fvc_ln      rat_ln
1  3.4 3.9 0.80   0 1.223775 1.360977 -0.22314355
2  5.8 6.2 0.91   1 1.757858 1.824549 -0.09431068
3  3.1 4.5 0.90   0 1.131402 1.504077 -0.10536052
4   NA 6.0   NA   1       NA 1.791759          NA
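
Since the question asked for a loop, here is a minimal sketch of the same result written as an explicit for loop in base R. It assumes the dat and vars objects from the question and is meant for comparison only, not as a replacement for the across() call above.

```r
# Example data and variable names, copied from the question
dat <- data.frame(fev1 = c(3.4, 5.8, 3.1, NA),
                  fvc = c(3.9, 6.2, 4.5, 6.0),
                  rat = c(0.8, 0.91, 0.9, NA),
                  sex = c(0, 1, 0, 1))
vars <- c("fev1", "fvc", "rat")

# For each variable name, add a new column called "<name>_ln"
# holding the natural logarithm of the original column
for (v in vars) {
  dat[[paste0(v, "_ln")]] <- log(dat[[v]])
}

names(dat)
```

The original columns, including sex, are kept, and the three _ln columns are appended at the end.
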
Onyambu
  • The option I was looking for was: `dat %>% mutate(across(all_of(vars), log, .names = "{col}_ln"))`. Thank you for solving my doubt. – DavidMB Aug 26 '23 at 12:42

I'm guessing what you're looking for is this:

library(tidyverse)

setNames(purrr::map_df(dat[vars], log), paste0(vars, "_ln"))

Output:

# A tibble: 4 × 3
  fev1_ln fvc_ln  rat_ln
    <dbl>  <dbl>   <dbl>
1    1.22   1.36 -0.223 
2    1.76   1.82 -0.0943
3    1.13   1.50 -0.105 
4   NA      1.79 NA     

A good first step when using a map function is to figure out what it is mapping over. For a data frame, it maps over the columns, so each .x (or .y) is a vector or list. mutate expects a data frame, so you can't call it on either of those.
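
To make that concrete, here is a small sketch that inspects what map passes to the function. It assumes the dat and vars objects from the question; the names col_classes, logged and labels are only illustrative.

```r
library(purrr)

# Example data and variable names, copied from the question
dat <- data.frame(fev1 = c(3.4, 5.8, 3.1, NA),
                  fvc = c(3.9, 6.2, 4.5, 6.0),
                  rat = c(0.8, 0.91, 0.9, NA),
                  sex = c(0, 1, 0, 1))
vars <- c("fev1", "fvc", "rat")

# map over a data frame visits one column at a time,
# so .x is a plain numeric vector, not a data frame
col_classes <- map_chr(dat[vars], ~ class(.x))
col_classes

# Because .x is a vector, apply log() to it directly;
# the result is a list named after the original columns
logged <- map(dat[vars], log)
names(logged)

# imap also passes the column name as .y,
# which helps when the name is needed inside the function
labels <- imap_chr(dat[vars], ~ paste0(.y, "_ln"))
labels
```

Every entry of col_classes is "numeric", which is why piping .x into mutate fails: there is no data frame for it to work on.
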

I would give more specific advice, but it's hard to tell what is going on in your code, sorry!

Mark
  • What I am looking for is a data frame with the original variables plus their logarithms. Anyway, the code you wrote is very useful to me. I have two questions. How should the code be modified so that the new variables are added to the original data frame? Can you recommend a book, course or other material to learn about purrr? Thanks in advance – DavidMB Aug 26 '23 at 12:52
  • @DavidMB re: book, course, or other material - the way I originally learned to use R was with the `swirl` tutorials: https://swirlstats.com/ there's one called Advanced R Programming which covers purrr functions https://swirlstats.com/scn/arp.html – Mark Aug 26 '23 at 16:46
  • I also think the documentation for the tidyverse packages is quite readable. If you've tried reading documentation before, and bounced off it because it was too technical, I'd give another look to the documentation for the tidyverse packages, including purrr, it's quite good – Mark Aug 26 '23 at 16:47
  • to keep the original columns, you could do something like this: `bind_cols(dat[vars], setNames(purrr::map_df(dat[vars], log), paste0(vars, "_ln")))`, though I think Onyambu's final answer is quite elegant, and probably more what you had in mind – Mark Aug 26 '23 at 16:51
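
A sketch of the bind_cols idea from the last comment, with one assumed change: it binds to the whole of dat rather than dat[vars], so that sex is kept as in the desired output shown in the question. The names logs and result are only illustrative.

```r
library(dplyr)
library(purrr)

# Example data and variable names, copied from the question
dat <- data.frame(fev1 = c(3.4, 5.8, 3.1, NA),
                  fvc = c(3.9, 6.2, 4.5, 6.0),
                  rat = c(0.8, 0.91, 0.9, NA),
                  sex = c(0, 1, 0, 1))
vars <- c("fev1", "fvc", "rat")

# Log each selected column, rename with the "_ln" suffix,
# and turn the resulting named list into a data frame
logs <- as.data.frame(setNames(map(dat[vars], log), paste0(vars, "_ln")))

# Bind to the full data frame (not dat[vars]) so every original column is kept
result <- bind_cols(dat, logs)
names(result)
```
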