Questions tagged [cumulative-sum]

For questions regarding implementations or algorithms for calculating cumulative sums (also known as running totals). Always add the tag for the language/platform!

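A cumulative sum (also known as a running total or partial sum) is the sequence of partial sums of an input sequence: each output element is the sum of all input values up to and including that position. It can be computed in a single pass by maintaining one accumulator (the sum so far), which is updated each time a new value is read from the sequence.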
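As a minimal sketch of that idea in Python (the function name `running_totals` and the sample data are made up for illustration), a single accumulator is updated once per value, and the result matches the standard library's `itertools.accumulate`:

```python
from itertools import accumulate


def running_totals(values):
    """Yield the cumulative sum after each value, using one accumulator."""
    total = 0
    for value in values:
        total += value  # update the single running sum
        yield total


data = [1, 2, 3, 4]                      # illustrative input
manual = list(running_totals(data))      # hand-written single-pass version
builtin = list(accumulate(data))         # standard-library equivalent
print(manual)   # [1, 3, 6, 10]
print(builtin)  # [1, 3, 6, 10]
```

Most platforms covered by this tag expose the same operation under another name, so the language/platform tag determines which one applies.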

1433 questions
7 votes, 1 answer

Pyspark - Cumulative sum with reset condition

I have this dataframe +---+----+---+ | A| B| C| +---+----+---+ | 0|null| 1| | 1| 3.0| 0| | 2| 7.0| 0| | 3|null| 1| | 4| 4.0| 0| | 5| 3.0| 0| | 6|null| 1| | 7|null| 1| | 8|null| 1| | 9| 5.0| 0| | 10| 2.0| 0| | 11|null| …
Kafels • 3,864 • 1 • 15 • 32
7 votes, 3 answers

Get cumulative count per 2d array

I have general data, e.g. strings: np.random.seed(343) arr = np.sort(np.random.randint(5, size=(10, 10)), axis=1).astype(str) print (arr) [['0' '1' '1' '2' '2' '3' '3' '4' '4' '4'] ['1' '2' '2' '2' '3' '3' '3' '4' '4' '4'] ['0' '2' '2' '2' '2'…
jezrael • 822,522 • 95 • 1,334 • 1,252
7 votes, 1 answer

Running Sums for Multiple Categories in MySQL

I have a table of the form Category Time Qty A 1 20 B 2 3 A 3 43 A 4 20 B 5 25 I need a running total to be calculated by…
Bogdan • 79 • 1 • 2
7 votes, 3 answers

Compute running mean with tapered windows

Given a (dummy) vector index=log(seq(10,20,by=0.5)) I want to compute the running mean with centered window and with tapered windows at each end, i.e. that the first entry is left untouched, the second is the average of a window size of 3, and so…
ClimateUnboxed • 7,106 • 3 • 41 • 86
7 votes, 2 answers

Selecting a subset of rows that exceed a percentage of total values

I have a table with customers, users and revenue similar to below (in reality thousands of records): Customer User Revenue 001 James 500 002 James 750 003 James 450 004 Sarah 100 005 Sarah 500 006 …
bendataclear • 3,802 • 3 • 32 • 51
6 votes, 3 answers

Cumulative sum in R by group and start over when sum of values in group larger than maximum value

The function below groups values in a vector based on whether the cumulative sum has reached a certain max value and then starts over. cs_group <- function(x, threshold) { cumsum <- 0 group <- 1 result <- numeric() for (i in 1:length(x))…
milan • 4,782 • 2 • 21 • 39
6 votes, 2 answers

Python: Calculate total return by adding cumulative dividends and taking the compound annual growth rate (CAGR)

I have a dataframe with annual price and dividend data for numerous companies. I am looking to calculate the 3-year annualized return by adding all of the dividends received during the three years to the ending stock price, and then taking the CAGR.…
6 votes, 2 answers

Conditional running count (cumulative sum) with reset in R (dplyr)

I'm trying to calculate a running count (i.e., cumulative sum) that is conditional on other variables and that can reset for particular values on another variable. I'm working in R and would prefer a dplyr-based solution, if possible. I'd like to…
itpetersen • 1,475 • 3 • 13 • 32
6 votes, 1 answer

Running Total - Date difference

This is what the table looks like: create table IncomeTest (SubjectId int, Date_Value date, debit number, credit number); insert into IncomeTest values (1, '7-SEP-2017', 11000, 0); insert into IncomeTest values (1, '7-DEC-2017', 6000, 0); insert…
Veljko89 • 1,813 • 3 • 28 • 43
6 votes, 2 answers

How to get accumulative maximum indices with numpy in Python?

I'm trying to get x and y coordinates of all cumulative maximums in a vector. I wrote a naive for-loop but I have a feeling there is some way to do it with numpy that's more elegant. Does anyone know of any functions, masking-technique, or…
O.rka • 29,847 • 68 • 194 • 309
6 votes, 1 answer

How to maintain a cumulative sum?

I have a sorteddict and I am interested in the cumulative sum of the values: >>> from blist import sorteddict >>> import numpy as np >>> x = sorteddict({1:1, 2:2, 5:5}) >>> zip(x.keys(), np.cumsum(x.values())) [(1, 1), (2, 3), (5, 8)] However, I…
mchen • 9,808 • 17 • 72 • 125
6 votes, 7 answers

Python Running Sum in List

Given the following list: a=[1,2,3] I'd like to generate a new list where each number is the sum of it and the values before it, like this: result = [1,3,6] Logic: 1 has no preceding value, so it stays the same. 3 is from the first value (1)…
Dance Party • 3,459 • 10 • 42 • 67
6 votes, 1 answer

How to calculate cumulative Total and % in DAX?

This might be very simple... I have the below summary table in Power BI and need to build a Pareto Chart, what I'm looking for is a way to create columns "D" and "E"... Thanks in advance! The Count from column "B" is a measure I've created in PBI…
Marcelo Aguilar • 67 • 1 • 2 • 7
6 votes, 4 answers

Sum of all rows prior to (and including) date on current row in MYSQL

It's important to know that the date will be unknown during the query time, so I cannot just hard code a 'WHERE' clause. Here's my table: +-----------+----------+-------------+ | Date_ID | Customer | Order_Count…
Danny W • 463 • 2 • 6 • 12
6 votes, 2 answers

3D variant for summed area table (SAT)

As per Wikipedia: A summed area table is a data structure and algorithm for quickly and efficiently generating the sum of values in a rectangular subset of a grid. For a 2D space a summed area table can be generated by iterating x,y over the desired…
Ninja420 • 3,542 • 3 • 22 • 34