I am working in R and have a data frame containing several series. For most of these columns, I need to compute the following two operations:

  1. max(0,x_t - max(x_{t-1}, x_{t-2}, x_{t-3}, x_{t-4}))
  2. max(0,-1 + x_t / max(x_{t-1}, x_{t-2}, x_{t-3}, x_{t-4}))

I tried this solution for one column (operation 1 only):

df %>% mutate( pmax(0,x - pmax(lag(x), lag(x,2), lag(x,3), lag(x,4))) )

This works for a single column, but I suspect it can be done for all columns and for both operations at once using dplyr’s across() and purrr-style syntax. Any ideas on how to do this?

Fef894

1 Answer

You could use the across() function in the dplyr package.

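library(dplyr)  # provides %>%, mutate(), across() and the vector version of lag()

# Define some test data (no seed is set, so the values differ between runs)
df <- data.frame(x = round(runif(100, 10, 15)), y = round(rnorm(100, 10, 5), 1))

# Define the function to apply to each column:
# max(0, x_t - max(x_{t-1}, x_{t-2}, x_{t-3}, x_{t-4}))
mypmax <- function(i) {
    pmax(0, i - pmax(lag(i), lag(i, 2), lag(i, 3), lag(i, 4)))
}

# Apply the function to columns 1 and 2,
# storing the results in new columns named "new_<column>".
df %>% mutate(across(c(1, 2), mypmax, .names = "new_{.col}"))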

     x    y new_x new_y
1   12  7.3    NA    NA
2   14 10.9    NA    NA
3   10 17.8    NA    NA
4   14 12.5    NA    NA
5   15 10.0     1   0.0
6   14 11.6     0   0.0
7   10  7.9     0   0.0
8   12  8.6     0   0.0
9   11 11.3     0   0.0
10  11  4.7     0   0.0
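
The code above handles only the first operation. One possible way to cover both operations in a single call is to pass a named list of functions to across(). This is a minimal sketch rather than a definitive solution: the helper names prev_max4, excess_abs and excess_rel, the small example data frame, and the "{.col}_{.fn}" naming pattern are all assumptions made for illustration.

```r
library(dplyr)

# Hypothetical helper: largest of the previous four observations.
# The first four elements are NA because not enough history exists.
prev_max4 <- function(x) pmax(lag(x), lag(x, 2), lag(x, 3), lag(x, 4))

# Operation 1: max(0, x_t - max(x_{t-1}, ..., x_{t-4}))
excess_abs <- function(x) pmax(0, x - prev_max4(x))

# Operation 2: max(0, -1 + x_t / max(x_{t-1}, ..., x_{t-4}))
excess_rel <- function(x) pmax(0, x / prev_max4(x) - 1)

# Small illustrative data set (assumed values, not the poster's data)
df <- data.frame(
  x = c(12, 14, 10, 14, 15, 14),
  y = c(7.3, 10.9, 17.8, 12.5, 10.0, 20.0)
)

# Apply both functions to every column; the list names become
# the {.fn} part of the new column names (x_abs, x_rel, y_abs, y_rel).
result <- df %>%
  mutate(across(everything(),
                list(abs = excess_abs, rel = excess_rel),
                .names = "{.col}_{.fn}"))

print(result)
```

To restrict this to "most" columns instead of all of them, everything() can be replaced by a selection such as c(x, y) or where(is.numeric). Note that operation 2 divides by the lagged maximum, so series that can reach zero would need separate handling.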
Dave2e