1

Hello my dear friends/ teachers/ fellow R users,

I have been grappling with the idea of using accumulate family of functions from purrr for a while and recently I've come across a very tricky situation when we would like to calculate 2 new variables for every row where we have to use the other to calculate each. Here is my data set (it is an excerpt from another question I've seen):

structure(list(Scenario = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1), SlNo = c(NA, 
1L, 5L, 9L, 13L, 17L, 21L, 25L, 29L, 33L), A = c(NA, 14, 1, 17, 
8, 10, 9, 14, 14, 4), B = c(NA, 19, 13, 20, 7, 16, 12, 18, 15, 
17), C = c(4, 0, 0, 0, 0, 0, 0, 0, 0, 0), E = c(6, 0, 0, 0, 0, 
0, 0, 0, 0, 0)), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))

And my desired output would be like:

   Scenario  SlNo     A     B     C      E
      <dbl> <int> <dbl> <dbl> <dbl>  <dbl>
 1        0    NA    NA    NA   4     6   
 2        1     1    14    19  32     1   
 3        1     5     1    13  35    -2   
 4        1     9    17    20  40.5  10.5 
 5        1    13     8     7  42.8  -0.25
 6        1    17    10    16  42.1   8.12
 7        1    21     9    12  46.2   8.19
 8        1    25    14    18  54.3  22.3 
 9        1    29    14    15  69.4  34.4 
10        1    33     4    17  91.1  58.1

We can of course use a for loop to calculate value for variable C and E from second row until the last row. The good thing about it is I can at the same time populate both variables at every iteration using for example the current value of E to calculate the current value of C and vice-versa:

for(i in 2:nrow(df)) {

df$C[i] <- if_else(df$Scenario[i] != 0, (1-0.5) * df$C[i-1] + 3 + 2 + df$B[i] + df$E[i-1],
              df$C[i])
df$E[i] <- if_else(df$Scenario[i] != 0, df$C[i] + df$B[i] - 50, df$E[i])

}

df

So I would be grateful to have some solutions with accumulate on how I could apply it to this sort of problems.

Thank you very much in advance

Anoushiravan R
  • 21,622
  • 3
  • 18
  • 41

2 Answers2

1

In this case, we require to generate two output columns simultaneously; and iteratively with the help of one input column and given two initial values of the two output columns. Now, purrr::accumulate normally works on one output based on one input, whereas purrr::accumulate2() works on 2 inputs for again one output. So instead, my strategy for accumulate is elaborated as under:-

For further reference you may see this answer also where multiple results have been generated in purrr::accumulate using even more than 2 arguments (input values) simultaneously

  • I presume your first row is just to input the first/initial values in remaining values of column.
  • To have multiple outputs in accumulate you can make behave this function purrr::accumulate (I must say awesome function) as generating a tibble instead of vector.
  • I removed first row.
  • used your initial values in .init argument
  • created an output in tibble format
  • used syntax tibble(C = (1 -0.5)* .x$C + 5 + .y + .x$E, E = 0.5 * .x$C + 5 + .x$E + 2 * .y - 50)
  • If you look closely, C is generated with the given formula. But,
  • For E we require generated (output) value of C therefore, I used formula of C instead of using .x or .y there
  • finally, unnested the output with using unnest_wider in tidyr
library(tidyverse, warn.conflicts = F)

df <- structure(list(Scenario = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1), SlNo = c(NA, 
                                                                          1L, 5L, 9L, 13L, 17L, 21L, 25L, 29L, 33L), A = c(NA, 14, 1, 17, 
                                                                                                                           8, 10, 9, 14, 14, 4), B = c(NA, 19, 13, 20, 7, 16, 12, 18, 15, 
                                                                                                                                                       17), C = c(4, 0, 0, 0, 0, 0, 0, 0, 0, 0), E = c(6, 0, 0, 0, 0, 
                                                                                                                                                                                                       0, 0, 0, 0, 0)), row.names = c(NA, -10L), class = c("tbl_df", 
                                                                                                                                                                                                                                                           "tbl", "data.frame"))

df[1:4] %>%
  filter(Scenario != 0) %>%
  mutate(new = accumulate(B, 
                          .init = tibble(C = 4, E = 6),
                          ~ tibble(C = (1 -0.5)* .x$C + 5 + .y + .x$E,
                                   E = 0.5 * .x$C + 5 + .x$E + 2 * .y - 50
                                   )
                          )[-1]
         ) %>% 
  unnest_wider(new)
#> # A tibble: 9 x 6
#>   Scenario  SlNo     A     B     C     E
#>      <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1        1     1    14    19  32    1   
#> 2        1     5     1    13  35   -2   
#> 3        1     9    17    20  40.5 10.5 
#> 4        1    13     8     7  42.8 -0.25
#> 5        1    17    10    16  42.1  8.12
#> 6        1    21     9    12  46.2  8.19
#> 7        1    25    14    18  54.3 22.3 
#> 8        1    29    14    15  69.4 34.4 
#> 9        1    33     4    17  91.1 58.1

Created on 2021-07-05 by the reprex package (v2.0.0)

AnilGoyal
  • 25,297
  • 4
  • 27
  • 45
0

Here is a base R solution:

df[, c("C", "E")] <- do.call(rbind, Reduce(function(x, y) {
  data.frame(C = 0.5 * x[["C"]] + 5 + df$B[y] + x[["E"]],
             E = 0.5 * x[["C"]] + 5 + df$B[y] + x[["E"]] + df$B[y] - 50)
}, init = data.frame(C = 4, E = 6), 2:nrow(df), accumulate = TRUE))

df

# A tibble: 10 x 6
   Scenario  SlNo     A     B     C     E
      <dbl> <int> <dbl> <dbl> <dbl> <dbl>
 1        0    NA    NA    NA   4    6   
 2        1     1    14    19  32    1   
 3        1     5     1    13  35   -2   
 4        1     9    17    20  40.5 10.5 
 5        1    13     8     7  42.8 -0.25
 6        1    17    10    16  42.1  8.12
 7        1    21     9    12  46.2  8.19
 8        1    25    14    18  54.3 22.3 
 9        1    29    14    15  69.4 34.4 
10        1    33     4    17  91.1 58.1 
Anoushiravan R
  • 21,622
  • 3
  • 18
  • 41