Questions tagged [cumsum]

cumsum is a MATLAB, NumPy, pandas, and R function that returns the cumulative sum of elements along a given dimension of an array.
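
As a brief illustration of the tag description, here is a minimal NumPy sketch of a cumulative sum over a flattened array and along each axis. The array `a` and the variable names are made up for the example.

```python
import numpy as np

# Hypothetical 2x3 array used only for illustration.
a = np.array([[1, 2, 3],
              [4, 5, 6]])

# With no axis given, NumPy flattens the array first.
flat = np.cumsum(a)

# axis=0 accumulates down the rows (within each column).
down_rows = np.cumsum(a, axis=0)

# axis=1 accumulates across the columns (within each row).
across_cols = np.cumsum(a, axis=1)

print(flat)         # [ 1  3  6 10 15 21]
print(down_rows)    # [[1 2 3] [5 7 9]]
print(across_cols)  # [[ 1  3  6] [ 4  9 15]]
```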

799 questions
5
votes
2 answers

vectorize cumsum by factor in R

I am trying to create a column in a very large data frame (~ 2.2 million rows) that calculates the cumulative sum of 1's for each factor level, and resets when a new factor level is reached. Below is some basic data that resembles my own. itemcode…
jvalenti
  • 604
  • 1
  • 9
  • 31
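
The question above is about R, but the idea of a counter that restarts at every new level can be sketched in NumPy without a loop. This is only an illustrative sketch: it assumes "a new factor level is reached" means a change between consecutive rows, and the function name `run_counter` and the sample codes are hypothetical.

```python
import numpy as np

def run_counter(levels):
    """Count 1, 2, 3, ... within each run of equal consecutive values."""
    levels = np.asarray(levels)
    idx = np.arange(len(levels))
    # True wherever a value differs from the one before it (row 0 starts a run).
    new_run = np.r_[True, levels[1:] != levels[:-1]]
    # Index of the most recent run start, carried forward to every row.
    run_start = np.maximum.accumulate(np.where(new_run, idx, 0))
    return idx - run_start + 1

# Hypothetical item codes; the real data in the question is truncated.
codes = ["a", "a", "b", "b", "b", "a"]
print(run_counter(codes))  # [1 2 1 2 3 1]
```

If the levels are not stored contiguously and the count should continue across separated blocks of the same level, a grouped cumulative count would be needed instead.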
5
votes
1 answer

cumulative sum over time

I have this data: thedat <- structure(list(id = c(" w12", " w12", " w12", " w11", " w3", " w3", " w12", " w45", " w24", " w24", " w24", " w08", " …
user1322296
  • 566
  • 2
  • 7
  • 26
5
votes
3 answers

Apply in R: recursive function that operates on its own previous result

How do I apply a function that can "see" the preceding result when operating by rows? This comes up a lot, but my current problem requires a running total by student that resets if the total doesn't get to 5. Example Data: > df row Student Absent…
Sam
  • 125
  • 4
  • 7
4
votes
3 answers

lag and cumulative sum by group

For a matrix with three columns: ID t res 1 1 -1.5 1 2 -1.5 1 3 0.5 1 4 0.5 2 1 -0.5 2 2 -0.5 2 3 -2.0 2 4 -1.5 2 …
4
votes
4 answers

How to cumsum the elements of a vector under certain condition in R?

My objective is to do a cumulative sum of the elements of a vector and assign the result to each element. But when certain condition is reached, then reset the cumulative sum. For example: vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1) Now,…
Joaquín L
  • 184
  • 14
4
votes
2 answers

Conditional cumulative sum from two columns

I can't get my head around the following problem. Assuming the following data: library(tidyverse) df <- tibble(source = c("A", "A", "B", "B", "B", "C"), value = c(5, 10, NA, NA, NA, 20), add = c(1, 1, 1, 2, 3, 4)) What…
deschen
  • 10,012
  • 3
  • 27
  • 50
4
votes
5 answers

Numpy array counter with reset

I have a numpy array with only -1, 1 and 0, like this: np.array([1,1,-1,-1,0,-1,1]) I would like a new array that counts the -1 encountered. The counter must reset when a 0 appears and remain the same when it's a 1: Desired…
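
One possible vectorized approach to the rule described in this excerpt, sketched with `np.cumsum` and `np.maximum.accumulate`. The desired output is truncated in the excerpt, so the assumption that the counter reads 0 at the position of each 0 is mine, as is the function name `count_with_reset`.

```python
import numpy as np

def count_with_reset(a):
    """Running count of -1 values that restarts whenever a 0 appears."""
    a = np.asarray(a)
    # Running total of every -1 seen so far, ignoring resets.
    total = np.cumsum(a == -1)
    # At each 0, remember the total so far and carry it forward.
    # The running total never decreases, so a running maximum is enough.
    offset = np.maximum.accumulate(np.where(a == 0, total, 0))
    # Subtracting the carried total restarts the count after each 0.
    return total - offset

arr = np.array([1, 1, -1, -1, 0, -1, 1])
print(count_with_reset(arr))  # [0 0 1 2 0 1 1]
```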
4
votes
1 answer

Subset by selecting only increasing values in previous rows greater than a certain number

What I am trying to do seems simple, but after 2 days of searching I have decided to post my first question here to see if anyone can help. I have a dataframe (df) of 5 variables and 250,000 rows. Sample: date.time Lat Lon …
Meg.abytes
  • 169
  • 8
4
votes
4 answers

Use of cumsum() iteratively in one column

Is it possible to use cumsum() iteratively in one column with start - stop conditional on other column: given the dataframe df with one column X where values are ascending. cumsum() should stop when reaching 10 or a multiple of ten…
TarJae
  • 72,363
  • 6
  • 19
  • 66
4
votes
3 answers

Use a single, common group-specific baseline for calculations (cumsum) within sub-groups

I'm looking for a tidy solution, preferably using tidyverse. This question is in line with this answer; it does, however, have an added twist. My data has an overall grouping variable 'grp'. Within each such group, I want to perform calculations based…
Eric Fail
  • 8,191
  • 8
  • 72
  • 128
4
votes
2 answers

How to create offsets from start in pandas, given length of segments and offsets in segment?

The title may not be the most informative. I have the following working code I want to vectorize [no for loops] using native pandas. Basically, it should return for each row its cumulative offset from 0, given the length of each segment, and a…
Gulzar
  • 23,452
  • 27
  • 113
  • 201
4
votes
1 answer

How to group based on cumulative sum that resets on a condition

I have a pandas df with word counts corresponding to articles. I want to be able to add another column MERGED that is based on groups of articles that have a minimum cumulative sum of 'min_words'. df = pd.DataFrame([[ 0, 6], [ …
ginobimura
  • 115
  • 1
  • 5
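
A cumulative sum that resets on a threshold depends on its own earlier resets, so a plain loop is often the simplest way to express it. The sketch below works on a plain list of word counts and returns group labels that could be assigned to a column such as MERGED. The function name `merge_groups` and the sample counts are hypothetical, since the data in the excerpt is truncated.

```python
def merge_groups(word_counts, min_words):
    """Label consecutive items so each group reaches at least min_words."""
    labels = []
    group = 0
    running = 0
    for count in word_counts:
        labels.append(group)
        running += count
        # Once the running total reaches the minimum, close the group
        # and start accumulating again from zero.
        if running >= min_words:
            group += 1
            running = 0
    return labels

# Hypothetical word counts, not taken from the question.
counts = [6, 3, 2, 10, 1, 4]
print(merge_groups(counts, min_words=10))  # [0, 0, 0, 1, 2, 2]
```

Note that the last group can fall short of `min_words` when the data runs out, as it does in this example. How to handle that case is not stated in the excerpt.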
4
votes
3 answers

CumSum dataframe rows if value is on another dataframe

I have 2 dataframes in my jupyter-notebook. In the first one I have a series with lists of words and in the second I have a series with words. I need to iterate over each list of words from the first dataframe, to check if the word is on the other…
4
votes
1 answer

Why is dplyr::cummean(x) not equal to cumsum(x)/seq_along(x)?

Why is cummean(x) not equal to cumsum(x)/seq_along(x)? set.seed(456) x <- as.integer(runif(30)*300) x cummean(x) cumsum(x)/seq_along(x) [1] 26 63 219 255 236 99 24 85 71 115 111 65 226 246 179 195 252 135 215 87 53 216 271 133 251 211 285…
Mervyn Lau
  • 43
  • 3
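
For reference, the identity the question relies on (a cumulative mean equals the cumulative sum divided by the running count) can be checked with a short NumPy sketch. It uses the first five values printed in the excerpt and does not attempt to explain the dplyr behaviour the question asks about.

```python
import numpy as np

# First five values shown in the excerpt.
x = np.array([26, 63, 219, 255, 236])

# Cumulative mean: running sum divided by the running count 1, 2, 3, ...
running_mean = np.cumsum(x) / np.arange(1, len(x) + 1)

# Cross-check against the mean of each prefix computed directly.
prefix_means = np.array([x[:k].mean() for k in range(1, len(x) + 1)])

print(running_mean)
print(np.allclose(running_mean, prefix_means))  # True
```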
4
votes
4 answers

Percentage of events before and after a sequence of zeros in pandas rows

I have a dataframe like the following: ID 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89 90 total ----------------------------------------------------------------------------------------------------- 0 …
RafaJM
  • 481
  • 6
  • 17