I'm trying to use dplyr's mutate_at to subtract one numeric column (A2) from its corresponding numeric column (A1). I have multiple column pairs (B, C, D, E, ...) and several data frames (df1 to df99) to do this for, so I want to write a function.

df1 <- df1 %>% mutate_at(.vars = vars(A1), .funs = funs(remainder = .-A2))

This works fine. However, when I try to wrap it in a function:

REMAINDER <- function(df, numer, denom) {
  df <- df %>% mutate_at(.vars = vars(numer), .funs = funs(remainder = . - denom))
  return(df)
}

Calling it with df1 <- REMAINDER(df1, A1, A2)

gives the error

Error in mutate_impl(.data, dots) : Evaluation error: non-numeric argument to binary operator.

I don't understand this, since the same line of code works when run outside a function and my columns are numeric.

user126082

2 Answers

The vignette Programming with dplyr explains in great detail what to do: capture the column arguments with enquo() and unquote them with !! inside the dplyr call:

library(dplyr)
REMAINDER <- function(df, numer, denom) {
  numer <- enquo(numer)
  denom <- enquo(denom)
  df %>% mutate_at(.vars = vars(!! numer), .funs = funs(remainder = . - !! denom))
}

df1 <- data_frame(A1 = 11:13, A2 = 3:1, B1 = 21:23, B2 = 8:6)

REMAINDER(df1, A1, A2)
# A tibble: 3 x 5
     A1    A2    B1    B2 remainder
  <int> <int> <int> <int>     <int>
1    11     3    21     8         8
2    12     2    22     7        10
3    13     1    23     6        12
REMAINDER(df1, B1, B2)
# A tibble: 3 x 5
     A1    A2    B1    B2 remainder
  <int> <int> <int> <int>     <int>
1    11     3    21     8        13
2    12     2    22     7        15
3    13     1    23     6        17
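
To make the capturing step a little more concrete, here is a minimal sketch of what enquo() does with a bare column name. The helper show_capture() is hypothetical and exists only for illustration; it assumes the rlang package (installed as a dependency of dplyr) is available.

```r
library(rlang)

# Hypothetical helper, for illustration only: returns the name of the
# column that the caller passed as a bare symbol.
show_capture <- function(col) {
  col <- enquo(col)  # capture the expression the caller typed, unevaluated
  as_name(col)       # turn the captured symbol into a string
}

# A1 does not need to exist here, because it is never evaluated.
captured <- show_capture(A1)
print(captured)
```

Inside REMAINDER(), !! then inserts the captured column back into the dplyr call, so the subtraction is evaluated against the data frame's columns.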

Naming the result column

The OP wants to update df1 and to apply this operation to other columns as well.

Unfortunately, the REMAINDER() function as currently defined always names the result column remainder, so a second call overwrites the result of the first:

df1
# A tibble: 3 x 4
     A1    A2    B1    B2
  <int> <int> <int> <int>
1    11     3    21     8
2    12     2    22     7
3    13     1    23     6
df1 <- REMAINDER(df1, A1, A2)
df1
# A tibble: 3 x 5
     A1    A2    B1    B2 remainder
  <int> <int> <int> <int>     <int>
1    11     3    21     8         8
2    12     2    22     7        10
3    13     1    23     6        12
df1 <- REMAINDER(df1, B1, B2)
df1
# A tibble: 3 x 5
     A1    A2    B1    B2 remainder
  <int> <int> <int> <int>     <int>
1    11     3    21     8        13
2    12     2    22     7        15
3    13     1    23     6        17

The function can be modified so that each result column gets its own name, built from the two input column names:

REMAINDER <- function(df, numer, denom) {
  numer <- enquo(numer)
  denom <- enquo(denom)
  result_name <- paste0("remainder_", quo_name(numer), "_", quo_name(denom))
  df %>% mutate_at(.vars = vars(!! numer),
                   .funs = funs(!! result_name := . - !! denom))
}

Now, calling REMAINDER() twice on different columns and replacing df1 after each call, we get

df1 <- REMAINDER(df1, A1, A2)
df1 <- REMAINDER(df1, B1, B2)
df1
# A tibble: 3 x 6
     A1    A2    B1    B2 remainder_A1_A2 remainder_B1_B2
  <int> <int> <int> <int>           <int>           <int>
1    11     3    21     8               8              13
2    12     2    22     7              10              15
3    13     1    23     6              12              17
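
In more recent dplyr releases, funs() and data_frame() are deprecated and mutate_at() is superseded. The following is a sketch of how the same idea might look there, assuming dplyr 1.0 or later and a recent rlang. The name remainder_col() is hypothetical and not part of the original answer.

```r
library(dplyr)

# Hypothetical variant of REMAINDER() for dplyr >= 1.0 (an assumption,
# not the original answer's code).
remainder_col <- function(df, numer, denom) {
  # Build the result name from the bare column names the caller passed
  result_name <- paste0("remainder_",
                        rlang::as_name(rlang::enquo(numer)), "_",
                        rlang::as_name(rlang::enquo(denom)))
  # {{ }} forwards the bare column names into mutate();
  # := allows a computed name on the left-hand side
  df %>% mutate(!!result_name := {{ numer }} - {{ denom }})
}

df1 <- tibble(A1 = 11:13, A2 = 3:1, B1 = 21:23, B2 = 8:6)
df1 <- remainder_col(df1, A1, A2)
df1 <- remainder_col(df1, B1, B2)
print(df1)
```

Because only a single new column is computed per call, plain mutate() is enough here and no scoped variant is needed.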
Uwe

I used this approach to subtract pairs of columns across a list of data frames. My example has only 3 pairs of columns in each of two data frames, but it works with any number of column pairs and data frames, provided the columns are ordered in adjacent pairs (A1, A2, B1, B2, ...).

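library(data.table)

# Example data: three column pairs (A1/A2, B1/B2, C1/C2)
dt <- data.table(A1 = round(runif(3), 1), A2 = round(runif(3), 1),
                 B1 = round(runif(3), 1), B2 = round(runif(3), 1),
                 C1 = round(runif(3), 1), C2 = round(runif(3), 1))

# A list of two data tables with the same column layout
dt <- list(dt, dt + dt)

# For each table, subtract the second column of every adjacent pair from the first
lapply(seq_along(dt), function(z) {
  dt[[z]][, lapply(1:(ncol(.SD)/2), function(x) (.SD[[2*x-1]] - .SD[[2*x]]))]
})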
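
The result columns of the code above get default names (V1, V2, ...). As a sketch of one way to give them descriptive names, the hypothetical helper subtract_pairs() below builds each name from the two source columns. It assumes, as above, that the columns come in adjacent pairs; the fixed example data replaces runif() only so that the result is reproducible.

```r
library(data.table)

# Hypothetical helper, for illustration only: subtracts the second column
# of each adjacent pair from the first and names the result after both.
subtract_pairs <- function(d) {
  first <- seq(1, ncol(d), by = 2)  # positions of A1, B1, ...
  res <- lapply(first, function(i) d[[i]] - d[[i + 1]])
  names(res) <- paste0("remainder_", names(d)[first], "_", names(d)[first + 1])
  as.data.table(res)
}

d <- data.table(A1 = 11:13, A2 = 3:1, B1 = 21:23, B2 = 8:6)
res <- subtract_pairs(d)
print(res)
```

For a list of tables, lapply(list_of_tables, subtract_pairs) would apply it to each element, where list_of_tables stands for a list like the one built above.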
SamPer