0

I group the observations in a dataframe by ID. I want to keep the value of the first row in each group as it is, while fill in the remaining rows in the same group with lagged values + 2 and raise it to the power of 2. I want to update the lagged values as I proceed.

Please take the following dataset as an example:

ID <- c("1","1","1","2","2","3","3","3")
val <- c(1:8)
df <- data.frame(ID,val)
  ID val
1  1   1
2  1   2
3  1   3
4  2   4
5  2   5
6  3   6
7  3   7
8  3   8

I am expecting to see:

  ID val
1  1   1
2  1   (1 + 2)^2 = 9
3  1   (9 + 2)^2 = 121
4  2   4
5  2   (4 + 2)^2 = 36
6  3   6
7  3   (6 + 2)^2 = 64
8  3   (64 + 2)^2 = 4356

What I have tried:

df <- df %>% group_by(ID) %>% mutate(val = ifelse(row_number()!=1,(lag(val)+2)^2, val))

But what I seemed to get is:

  ID val
1  1   1
2  1   (1 + 2)^2 = 9
3  1   (2 + 2)^2 = 16
4  2   4
5  2   (4 + 2)^2 = 36
6  3   6
7  3   (6 + 2)^2 = 64
8  3   (7 + 2)^2 = 81

I guess R does not update the value in the previous row before it proceeds to the next row, so it was using the old lagged value. Is there a way to fix this? Also, I have to apply similar but more complicated calculations (a lot of exponents) on a large dataset, so if there is a quick way, it will be perfect!

Thank you!

dianaiii
  • 89
  • 4

4 Answers4

3

If you are pssing two arguments use accumulate2 instead. ..1 will represent previously iterated value, ..2 will represent first argument and ..3 will represent second argument.

Note ..2 is not used directly here but you are using it for invisble iteration only


df %>% 
  group_by(ID) %>% 
  mutate(val = accumulate2(val, pw, ~(..1+ 2)^..3)) %>%
  ungroup

AnilGoyal
  • 25,297
  • 4
  • 27
  • 45
2

You can use accumulate to perform such recursive calculation.

library(dplyr)
library(purrr)

df %>% 
  group_by(ID) %>% 
  mutate(val = accumulate(val, ~.x + 2)) %>%
  ungroup

# A tibble: 8 x 2
#  ID      val
#  <chr> <dbl>
#1 1         1
#2 1         3
#3 1         5
#4 2         4
#5 2         6
#6 3         6
#7 3         8
#8 3        10
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Thanks for your reply! I think this is exactly what I am looking for. However, I needed to change my question a bit so that exponents is involved as well and it cannot be solved with cumsum. Could you please help me further? – dianaiii Apr 18 '21 at 13:49
  • @dianaiii I am not using `cumsum` in my answer. You can solve the updated question in the same way with `accumulate` i.e `mutate(val = accumulate(val, ~(.x + 2)^2))` – Ronak Shah Apr 18 '21 at 14:02
  • I see! Thank you very much for helping to make it work. And I was referring to another answer, sorry for the confusion! – dianaiii Apr 18 '21 at 14:07
  • a follow up question, if the power of the exponent is stored in a separate column, say column "pw", how can we incorporate that into `mutate`? I tried `mutate(val = accumulate(val, ~(.x^pw)))` but it returned a vector for all cells. – dianaiii Apr 18 '21 at 14:41
2

Using base R

df$val <- with(df,  ave(val, ID, FUN = function(u)
              Reduce(function(x, y) x+ 2, u, accumulate = TRUE)))

Or using seq

library(dplyr)
df %>%
    group_by(ID) %>%
    mutate(val =  seq(first(val), length.out = n(), by = 2))
akrun
  • 874,273
  • 37
  • 540
  • 662
2

Below is one option using Reduce in data.table

> setDT(df)[, val := Reduce(function(x, y) (x + 2)^2, val, accumulate = TRUE), ID][]
   ID  val
1:  1    1
2:  1    9
3:  1  121
4:  2    4
5:  2   36
6:  3    6
7:  3   64
8:  3 4356
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81