7

I am trying to do an inventory calculation in R which requires a row wise calculation for each Mat-Plant combination. Here's a test data set -

df <- structure(list(Mat = c("A", "A", "A", "A", "A", "A", "B", "B"
), Plant = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"), 
    Day = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L), UU = c(0L, 10L, 
    0L, 0L, 0L, 120L, 10L, 0L), CumDailyFcst = c(11L, 22L, 33L, 
    0L, 5L, 10L, 20L, 50L)), .Names = c("Mat", "Plant", "Day", 
"UU", "CumDailyFcst"), class = "data.frame", row.names = c(NA, 
-8L))

  Mat Plant Day  UU CumDailyFcst
1   A    P1   1   0           11
2   A    P1   2  10           22
3   A    P1   3   0           33
4   A    P2   1   0            0
5   A    P2   2   0            5
6   A    P2   3 120           10
7   B    P1   1  10           20
8   B    P1   2   0           50

I need a new field "EffectiveFcst" such that when Day = 1 then EffectiveFcst = CumDailyFcst and for following days -

enter image description here

Here's the desired output -

  Mat Plant Day  UU CumDailyFcst EffectiveFcst
1   A    P1   1   0           11            11
2   A    P1   2  10           22            22
3   A    P1   3   0           33            23
4   A    P2   1   0            0             0
5   A    P2   2   0            5             5
6   A    P2   3 120           10            10
7   B    P1   1  10           20            20
8   B    P1   2   0           50            40

I am currently using a for loop but the actual table is >300K rows so hoping to do this with tidyverse for more elegant and faster approach. Tried the following but didn't work out -

group_by(df, Mat, Plant) %>%
  mutate(EffectiveFcst = ifelse(row_number()==1, CumDailyFcst, 0)) %>%
  mutate(EffectiveFcst = ifelse(row_number() > 1, CumDailyFcst - lag(CumDailyFcst, default = 0) + max(lag(EffectiveFcst, default = 0) - lag(UU, default = 0), 0), EffectiveFcst)) %>%
  print(n = nrow(.))
Shree
  • 10,835
  • 1
  • 14
  • 36
  • it's basically `max(lag(EffectiveFcst) - lag(UU), 0)` i.e. if the difference is negative then make it 0. – Shree Oct 10 '18 at 00:40
  • I was thinking your code worked out , but wanted a bit more faster code – akrun Oct 10 '18 at 01:01
  • Is it possible you've altered cell F5? – Nettle Oct 10 '18 at 01:12
  • @akrun....no it doesn't work...this is surprisingly tricky without a loop...I feel it's not even possible without a loop... – Shree Oct 10 '18 at 01:21
  • @Nettle yes, F2, F5, and F8 are equal to corresponding `CumDailyFcst`...basically calculation restarts for every Mat-Plant combo. – Shree Oct 10 '18 at 01:23

1 Answers1

5

We can use accumulate from purrr

library(tidyverse)
df %>% 
   group_by(Mat, Plant) %>% 
   mutate(EffectiveFcst =  accumulate(CumDailyFcst - lag(UU, default = 0),  ~ 
          .y , .init = first(CumDailyFcst))[-1] ) 
# A tibble: 8 x 6
# Groups:   Mat, Plant [3]
#  Mat   Plant   Day    UU CumDailyFcst EffectiveFcst
#  <chr> <chr> <int> <int>        <int>         <dbl>
#1 A     P1        1     0           11            11
#2 A     P1        2    10           22            22
#3 A     P1        3     0           33            23
#4 A     P2        1     0            0             0
#5 A     P2        2     0            5             5
#6 A     P2        3   120           10            10
#7 B     P1        1    10           20            20
#8 B     P1        2     0           50            40
akrun
  • 874,273
  • 37
  • 540
  • 662
  • Thanks for the answer, but that's not the desired output. EffectiveFcst for rows 3 and 6 are wrong. The issue is that we need to get the updated EffectiveFcst from the previous row every time we "move" to the next row. – Shree Oct 10 '18 at 00:59
  • @Shree Can you check the code now. You need `accumulate` – akrun Oct 10 '18 at 01:41