0

I am trying to add columns using rowSums because it is a fast function, using cur_column because the select columns depend on the initial columns a_value b_value. This is a reprex, so it needs to use rowSums cur_column or something similar because it will be used on thousands of rows and ~ 16 columns.

input:

library(tidyverse)

df <- tibble(
  a_value = c(1, 2, 3),
  b_value = c(4, 5, 6),
  a_param1 = c(7, 8, 9),
  b_param1 = c(10, 11, 12),
  a_param2 = c(13, 14, 15),
  b_param2 = c(16, 17, 18),
  a_param3 = c(19, 20, 21),
  b_param3 = c(22, 23, 24),
  a_param4 = c(25, 26, 27),
  b_param4 = c(28, 29, 30)
)

result of input.

# A tibble: 3 x 10
  a_value b_value a_param1 b_param1 a_param2 b_param2 a_param3 b_param3 a_param4 b_param4
    <dbl>   <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1       1       4        7       10       13       16       19       22       25       28
2       2       5        8       11       14       17       20       23       26       29
3       3       6        9       12       15       18       21       24       27       30

desired output. The a|bvalue_value_exc_parami i+1 is the sum of a|b_parami and a|b_parami+1

# A tibble: 3 x 14
  a_value b_value a_param1 b_param1 a_param2 b_param2 a_param3 b_param3 a_param4 b_param4 a_value_exc_param_12 b_value_exc_param_12 a_value_exc_param_34 b_value_exc_param_34
    <dbl>   <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>                <dbl>                <dbl>                <dbl>                <dbl>
1       1       4        7       10       13       16       19       22       25       28                   20                   26                   44                   50
2       2       5        8       11       14       17       20       23       26       29                   22                   28                   46                   52
3       3       6        9       12       15       18       21       24       27       30                   24                   30                   48                   54

What I tried so far, it does not work:


df %>% 
  mutate(
    across(
      contains("_value"),
      list(
      param_12 = rowSums(~ select(.,
                     get(str_replace(cur_column(), "_value", "_param1")),
                     get(str_replace(cur_column(), "_value", "_param2"))
                     )
           ),
      param_34 = rowSums(~ select(.,
                       get(str_replace(cur_column(), "_value", "_param3")),
                       get(str_replace(cur_column(), "_value", "_param4"))
      )
      
      ),
      .names = "{.col}_exc_{.fn}"
           
  )
  )
  )

There might be another easy and efficient approach I'm not aware of, feel free to provide, thanks.

Alvaro Morales
  • 1,845
  • 3
  • 12
  • 21

1 Answers1

3

You can use the following :

library(dplyr)
library(stringr)

df %>% 
  transmute(across(contains("_value"),
                list(param_12 = ~rowSums(select(cur_data(),
                            str_replace(cur_column(), "_value", "_param1"),
                            str_replace(cur_column(), "_value", "_param2"))), 
                     param_34 = ~rowSums(select(cur_data(),
                            str_replace(cur_column(), "_value", "_param3"),
                            str_replace(cur_column(), "_value", "_param4")))), 
                .names = "{.col}_exc_{.fn}"))

#  a_value_exc_param_12 a_value_exc_param_34 b_value_exc_param_12 b_value_exc_param_34
#                 <dbl>                <dbl>                <dbl>                <dbl>
#1                   20                   44                   26                   50
#2                   22                   46                   28                   52
#3                   24                   48                   30                   54

I have used transmute instead of mutate here to show the output. In your real case, you can of course use mutate to append these new columns.


We can write a function to avoid repetition of code.

apply_sum <- function(data, col, val, replace) {
  rowSums(select(data, str_replace(col, val, replace)))
}

df %>% 
  transmute(across(contains("_value"),
                   list(
               param_12 = ~apply_sum(cur_data(), cur_column(), 'value', c('param1', 'param2')),
               param_34 = ~apply_sum(cur_data(), cur_column(), 'value', c('param3', 'param4'))),
                   .names = "{.col}_exc_{.fn}"))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213