2

I have the following dataset:

my.df <- data.frame(my_function=rep(c("Var1+Var 2","Var 2-Var1","(Var 2-(Var 2-Var1))/Var 2"), 1),
                    `Var1`=rep(1:1,3), 
                    `Var 2`=rep(5:5,3), check.names = FALSE)

my.df
#                  my_function Var1 Var 2
# 1                 Var1+Var 2    1     5
# 2                 Var 2-Var1    1     5
# 3 (Var 2-(Var 2-Var1))/Var 2    1     5

I want to evaluate the expression stored in the my_function column for each row and store the result in a new column called outcome.

The outcome would be 1+5=6, 5-1=4 and (5-(5-1))/5=0.2 for rows 1, 2 and 3 respectively.

EDIT: Some of the correct answers reference the original version of the dataset, which was:

my.df <- data.frame(my_function=rep(c("1000+2000","2000-1000","(2000-(2000-1000))/2000"), 1), `1000`=rep(1:1,3), `2000`=rep(5:5,3))
J. Doe.
  • This would be massively easier if your code used valid R variable names instead of numbers. But, more importantly: please provide some more background information: why are you doing this, where is the data coming from, etc.? This is important for finding the most appropriate solution in your code (especially since evaluating arbitrary code provided externally is usually a *big* no-no, for reasons of efficiency as well as safety). – Konrad Rudolph Nov 07 '22 at 13:11
  • Hi Konrad - I appreciate the help. I updated the calculation. I moved from 1000 and 2000 to "Var1" and "Var 2". "Var 2" is by choice. – J. Doe. Nov 07 '22 at 13:15
  • Is it always 2 vars? – zx8754 Nov 07 '22 at 13:20
  • It has a space because I have many differently named variables in a wide format, the calculations are considerably more complex, and I need this to work for more complex names. Basically, the function names are fixed and I cannot change them. – J. Doe. Nov 07 '22 at 13:20
  • @zx8754 No there are ~100 variables. – J. Doe. Nov 07 '22 at 13:21
  • I think you need `check.names = FALSE)` in your original dataset example, to avoid X prefixes. – zx8754 Nov 07 '22 at 13:26
  • Just to clarify: Do you want to show the whole equation as a result (`1+5=6`) or just the actual result on the right-hand side of the equation (`6`)? – TimTeaFan Nov 07 '22 at 15:31
  • your function is a string, why not keep it as a function? – jangorecki Nov 08 '22 at 13:19

5 Answers

2

Loop over the rows of my_function; for each row, loop over the column names and gsub each name with its value; finally, the evil parse:

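vars <- colnames(my.df)[ -1 ]
# substitute longer names first, so that a name contained in another
# (e.g. "Var1" inside "Var10") is not replaced prematurely
vars <- vars[order(nchar(vars), decreasing = TRUE)]

sapply(seq_len(nrow(my.df)), function(i){
  res <- my.df[i, 1]
  for(v in vars){
    # replace each variable name with this row's value
    res <- gsub(v, my.df[i, v], res, fixed = TRUE)
  }
  # evaluate the resulting arithmetic string
  eval(parse(text = res))
})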
# [1] 6.0 4.0 0.2

Note:

fortunes::fortune("answer is parse")
# If the answer is parse() you should usually rethink the question.
#    -- Thomas Lumley
#       R-help (February 2005)
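
If substituting values into the text feels fragile, a variation is to leave the values in the data frame and only wrap the column names in backticks, so that names such as "Var 2" parse as ordinary symbols. The following is a minimal sketch, not a drop-in replacement: it assumes the column names contain only letters, digits and spaces (no regex metacharacters), and the names `pattern`, `quoted` and `outcome` are illustrative.

```r
my.df <- data.frame(
  my_function = c("Var1+Var 2", "Var 2-Var1", "(Var 2-(Var 2-Var1))/Var 2"),
  `Var1` = 1, `Var 2` = 5, check.names = FALSE
)

vars <- colnames(my.df)[-1]
# Longest names first: with perl = TRUE the alternation is tried in order,
# so "Var10" would be matched before "Var1"
vars <- vars[order(nchar(vars), decreasing = TRUE)]
pattern <- paste0("(", paste(vars, collapse = "|"), ")")

# One pass over the text: every column name becomes a backtick-quoted symbol
quoted <- gsub(pattern, "`\\1`", my.df$my_function, perl = TRUE)

# Evaluate each expression with its own row supplying the variable values
outcome <- vapply(seq_len(nrow(my.df)), function(i) {
  eval(str2lang(quoted[i]), envir = my.df[i, vars, drop = FALSE])
}, numeric(1))

outcome
```

This still parses text, so the caveat quoted above applies unchanged.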
zx8754
0

A solution could be:

my.df <- data.frame(my_function=rep(c("1000+2000","2000-1000","(2000-(2000-1000))/2000"), 1), `1000`=rep(1:1,3), `2000`=rep(5:5,3))

my.df
#>               my_function X1000 X2000
#> 1               1000+2000     1     5
#> 2               2000-1000     1     5
#> 3 (2000-(2000-1000))/2000     1     5

my.df$my_function = gsub("1000", "X1000", my.df$my_function)
my.df$my_function = gsub("2000", "X2000", my.df$my_function)

my.df$outcome = sapply(split(my.df, 1:NROW(my.df)), function(x)
  eval(str2lang(x$my_function),x))

my.df
#>                   my_function X1000 X2000 outcome
#> 1                 X1000+X2000     1     5     6.0
#> 2                 X2000-X1000     1     5     4.0
#> 3 (X2000-(X2000-X1000))/X2000     1     5     0.2

However, you should read the comments, since there are security concerns about evaluating arbitrary code. See https://stackoverflow.com/a/18391779/6912817 for an example.
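
One way to reduce that risk is to evaluate in an environment that exposes nothing except the arithmetic operators and the row's values. This is a rough sketch under the assumption that the expressions only need basic arithmetic; `safe_eval` and `vals` are hypothetical names, and this is not a complete sandbox.

```r
safe_eval <- function(expr_text, data) {
  # An environment whose parent is the empty environment: only the
  # names assigned below can be resolved during evaluation
  env <- new.env(parent = emptyenv())
  for (op in c("+", "-", "*", "/", "^", "(")) {
    assign(op, get(op, envir = baseenv()), envir = env)
  }
  # Make the variable values visible under their column names
  list2env(as.list(data), envir = env)
  eval(str2lang(expr_text), envir = env)
}

vals <- list(X1000 = 1, X2000 = 5)
safe_eval("(X2000-(X2000-X1000))/X2000", vals)

# A call to anything outside the whitelist fails with
# "could not find function", e.g. safe_eval("system('ls')", vals)
```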

Ric
0

As expressed in the comments, I don't love parsing code from text, especially if the code text was generated through some user input. Here is, in my opinion, a safe way to evaluate these expressions:

library(tidyverse)

my.df <- data.frame(my_function=rep(c("1000+2000","2000-1000","(2000-(2000-1000))/2000"), 1), `1000`=rep(1:1,3), `2000`=rep(5:5,3))

my.df |>
  mutate(sub_function = pmap_chr(list(my_function, X1000, X2000),
                                 ~gsub(pattern = "1000", 
                                      replacement = ..2,
                                      x = ..1) |> 
                                   gsub(pattern = "2000",
                                       replacement = ..3)),
         eval = map_chr(sub_function, ~as.character(Ryacas::yac_symbol(.x))))
#>               my_function X1000 X2000 sub_function eval
#> 1               1000+2000     1     5          1+5    6
#> 2               2000-1000     1     5          5-1    4
#> 3 (2000-(2000-1000))/2000     1     5  (5-(5-1))/5  1/5
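
If adding Ryacas is not an option, a cruder base R guard in the same spirit is to check that the substituted string contains nothing but digits, arithmetic operators, parentheses, dots and whitespace before it is evaluated. This is a different technique from the one above, sketched under the assumption that plain decimal arithmetic is all that is needed; `is_arithmetic` and `eval_checked` are hypothetical helper names.

```r
# TRUE only if the text consists of digits, + - * / ( ) . and whitespace
is_arithmetic <- function(txt) {
  grepl("^[0-9.+*/()\\s-]+$", txt, perl = TRUE)
}

eval_checked <- function(txt) {
  if (!is_arithmetic(txt)) {
    stop("unexpected characters in expression: ", txt)
  }
  # Only reached for strings that passed the character whitelist
  eval(parse(text = txt))
}

eval_checked("(5-(5-1))/5")
```

A string that passes the check can still be malformed (for example "5+"), in which case parse() raises a syntax error rather than running anything.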
AndS.
0

Using rlang and purrr::pmap_dbl() (with the second column named Var2, without the space):

library(rlang)
library(purrr)

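my.df$outcome <- pmap_dbl(
  my.df,
  \(my_function, Var1, Var2, ...) {
    # my_function is a string here; parse it and evaluate it in this
    # function's frame, where Var1 and Var2 are bound to the row's values
    eval(parse_expr(my_function))
  }
)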

my.df
#>               my_function Var1 Var2 outcome
#> 1               Var1+Var2    1    5     6.0
#> 2               Var2-Var1    1    5     4.0
#> 3 (Var2-(Var2-Var1))/Var2    1    5     0.2
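
With around 100 variables, listing every column as a function argument gets unwieldy. A possible variation, sketched here under the same assumption of syntactic column names, is to pass each row to rlang::eval_tidy() as a data mask, so the expression can refer to any column by name.

```r
library(rlang)

my.df <- data.frame(
  my_function = c("Var1+Var2", "Var2-Var1", "(Var2-(Var2-Var1))/Var2"),
  Var1 = 1, Var2 = 5
)

my.df$outcome <- vapply(seq_len(nrow(my.df)), function(i) {
  # Every column except my_function becomes visible to the expression
  row_values <- as.list(my.df[i, -1])
  eval_tidy(parse_expr(my.df$my_function[i]), data = row_values)
}, numeric(1))

my.df
```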
zephryl
0

Here is another approach using bquote and deparse. Since your example data uses integers, I first convert them to numeric to get rid of the L suffix in the output.

my.df <- data.frame(
  my_function = rep(c("Var1+Var 2",
                      "Var 2-Var1",
                      "(Var 2-(Var 2-Var1))/Var 2"),
                    1),
  `Var1` = rep(1:1,3),
  `Var 2` = rep(5:5,3),
  check.names = FALSE)

library(dplyr)
library(stringr)

my.df %>% 
  mutate(across(starts_with("Var"), as.double)) %>%
  rowwise() %>% 
  mutate(outcome = str_replace_all(my_function,
                                   "(Var\\s{0,1}[0-9]+)",
                                   '.(.data[["\\1"]])') %>% 
           paste0("bquote(", ., ")") %>%
           str2lang %>%
           eval %>%
           list,
         outcome = paste0(deparse(outcome), " = ", res = eval(outcome)))

#> # A tibble: 3 x 4
#> # Rowwise: 
#>   my_function                 Var1 `Var 2` outcome              
#>   <chr>                      <dbl>   <dbl> <chr>                
#> 1 Var1+Var 2                     1       5 1 + 5 = 6            
#> 2 Var 2-Var1                     1       5 5 - 1 = 4            
#> 3 (Var 2-(Var 2-Var1))/Var 2     1       5 (5 - (5 - 1))/5 = 0.2

Created on 2022-11-07 by the reprex package (v2.0.1)

TimTeaFan