
I want to compute a cumulative sum over the elements of a vector, so that each element is replaced by the running total up to that point. However, whenever a certain condition is met, the running total should reset and start again from the current element.

For example:

vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)

Now, suppose that the condition to reset the cumulative sum is that the next element has a different sign.

Then the desired output is:

vector_B <- c(1, 2, -1, -2, -3, 1, -1, -2, 1, -1)

How can I achieve this?

Maël
Joaquín L

4 Answers


You can use a custom function instead of cumsum and accumulate results using e.g. purrr::accumulate:

library(purrr)
vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)

purrr::accumulate(vector_A, function(a,b) {
  if (sign(a) == sign(b))
    a+b
  else
    b
  })

[1]  1  2 -1 -2 -3  1 -1 -2  1 -1

or if you want to avoid any branch:

purrr::accumulate(vector_A, function(a,b) { b + a*(sign(a) == sign(b))})

[1]  1  2 -1 -2 -3  1 -1 -2  1 -1
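For readers who would rather not depend on purrr, the same recurrence can be sketched as an explicit base R loop. This is only an illustration of what the accumulating function does step by step; the helper name cumsum_by_sign is made up here and does not come from any package.

```r
# Hypothetical helper: cumulative sum that restarts whenever the sign of the
# current element differs from the sign of the running total so far.
cumsum_by_sign <- function(x) {
  out <- x
  for (i in seq_along(x)[-1]) {
    # Same sign as the running total: keep accumulating.
    # Otherwise out[i] stays equal to x[i], which restarts the sum.
    if (sign(x[i]) == sign(out[i - 1])) {
      out[i] <- out[i - 1] + x[i]
    }
  }
  out
}

vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)
cumsum_by_sign(vector_A)
#  [1]  1  2 -1 -2 -3  1 -1 -2  1 -1
```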
Stefano Barbi

A base R option with Reduce

> Reduce(function(x, y) ifelse(x * y > 0, x + y, y), vector_A, accumulate = TRUE)
 [1]  1  2 -1 -2 -3  1 -1 -2  1 -1

or using ave + cumsum

> ave(vector_A, cumsum(c(1, diff(sign(vector_A)) != 0)), FUN = cumsum)
 [1]  1  2 -1 -2 -3  1 -1 -2  1 -1
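To see why the ave + cumsum version works, it may help to split it into its intermediate steps. The sketch below uses the same calls as the one-liner; the variable names s, changed, group and result are only for illustration.

```r
vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)

s <- sign(vector_A)
# diff(s) is non-zero wherever the sign changes; the leading 1 marks the
# first element as the start of the first run.
changed <- c(1, diff(s)) != 0
# Each TRUE starts a new run, so the running count of TRUEs is a run id.
group <- cumsum(changed)
group
#  [1] 1 1 2 2 2 3 4 4 5 5

# ave() applies cumsum() within each run and returns the values in place.
result <- ave(vector_A, group, FUN = cumsum)
result
#  [1]  1  2 -1 -2 -3  1 -1 -2  1 -1
```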
ThomasIsCoding
  • Thank you. The other answers also worked and were very good, but this one was the best for me. I'm going to make a small change to the function to make it clearer for me: `Reduce(function(x, y) ifelse(sign(x) == sign(y), x + y, y), vector_A, accumulate = TRUE)` – Joaquín L Feb 09 '22 at 09:30
    @JoaquínLoncón yes, thanks. you absolutely can customize it with your recipes :) – ThomasIsCoding Feb 09 '22 at 09:32

Using ave:

ave(vector_A, data.table::rleid(sign(vector_A)), FUN = cumsum)
#  [1]  1  2 -1 -2 -3  1 -1 -2  1 -1

A formula version of accumulate:

purrr::accumulate(vector_A, ~ ifelse(sign(.x) == sign(.y), .x + .y, .y))
#  [1]  1  2 -1 -2 -3  1 -1 -2  1 -1
Maël
    Ah `data.table::rleid()` is nice -- exactly what I was looking for! And really the key to the general problem in this question I think. – Mikko Marttila Feb 09 '22 at 09:30

The approach that comes to mind is to find the runs (rle()) defined by the condition (sign()) in the data, apply cumsum() to each run separately (tapply()), and then concatenate the results back into a vector (unlist()). Something like this:

vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)

run_length <- rle(sign(vector_A))$lengths
run_id <- rep(seq_along(run_length), run_length)

unlist(tapply(vector_A, run_id, cumsum), use.names = FALSE)
#>  [1]  1  2 -1 -2 -3  1 -1 -2  1 -1

To wrap this up a bit, I’d put the computation of the grouping factor (the run index) in its own function. The grouped operation can then be done with existing tools: tapply() as above, ave(), or, for data frames, group_by() followed by mutate() with dplyr.

run_index <- function(x) {
  with(rle(x), rep(seq_along(lengths), lengths))
}

ave(vector_A, run_index(sign(vector_A)), FUN = cumsum)
#>  [1]  1  2 -1 -2 -3  1 -1 -2  1 -1
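The same idea extends to other reset conditions, not just a change of sign. As a minimal sketch, assuming the condition can be expressed as a logical vector that is TRUE wherever a new sum should start, a general helper could look like this. The name cumsum_reset and its reset argument are hypothetical, chosen only for this example.

```r
# Hypothetical helper: `reset` is a logical vector of the same length as `x`,
# TRUE at every position where the cumulative sum should start over.
cumsum_reset <- function(x, reset) {
  # cumsum(reset) increases by one at every TRUE, giving a group id per run.
  ave(x, cumsum(reset), FUN = cumsum)
}

vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)

# The sign-change condition from the question:
sign_changed <- c(TRUE, diff(sign(vector_A)) != 0)
cumsum_reset(vector_A, sign_changed)
#>  [1]  1  2 -1 -2 -3  1 -1 -2  1 -1

# A different condition: start a new sum at every third element.
cumsum_reset(1:6, rep(c(TRUE, FALSE, FALSE), 2))
#> [1]  1  3  6  4  9 15
```

Keeping the condition separate from the summation means only the logical vector has to change when the reset rule changes.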
Mikko Marttila