
Good Morning,

I have the following dataframe:

import pandas as pd

a = [1,2,3,4,5,6]
b = pd.DataFrame({'a': a})

I would like to create a column that sums the next "n" rows of column "a", including the present value of a; I tried:

n = 2
b["r"] = b.a.rolling(n).sum() + b.a
print(b)
   a     r
0  1   NaN
1  2   5.0
2  3   8.0
3  4  11.0
4  5  14.0
5  6  17.0

It would be delightful to have:

   a     r
0  1   1 + 2 + 3 = 6
1  2   2 + 3 + 4 = 9
2  3   3 + 4 + 5 = 12
3  4   4 + 5 + 6 = 15
4  5   5 + 6 + 0 = 11
5  6   6 + 0 + 0 = 6
cs95
Alessandro Ceccarelli
  • From pandas 1.1 you can easily do this with [`FixedForwardWindowIndexer`](https://pandas.pydata.org/pandas-docs/dev/reference/api/pandas.api.indexers.FixedForwardWindowIndexer.html#pandas.api.indexers.FixedForwardWindowIndexer). – cs95 Aug 03 '20 at 09:10

1 Answer


pandas >= 1.1

pandas now supports "forward-looking window operations".

From 1.1, you can use FixedForwardWindowIndexer:

idx = pd.api.indexers.FixedForwardWindowIndexer
b['a'].rolling(window=idx(window_size=3), min_periods=1).sum()

0     6.0
1     9.0
2    12.0
3    15.0
4    11.0
5     6.0
Name: a, dtype: float64

Note that this is still (at the time of writing) very buggy for datetime rolling operations - use with caution.
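
For reference, here is a minimal end-to-end sketch of the forward-window approach, assuming pandas 1.1 or later. The variable name `indexer` and the `n + 1` window size are my own reading of the question (the current row plus the next `n` rows), not something the pandas docs prescribe.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

b = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6]})
n = 2  # number of following rows to add to the current row

# The window covers the current row plus the next n rows, hence n + 1.
indexer = FixedForwardWindowIndexer(window_size=n + 1)

# min_periods=1 lets the shorter windows at the end of the column
# produce a partial sum instead of NaN.
b["r"] = b["a"].rolling(window=indexer, min_periods=1).sum()
print(b)
```

With `n = 2` this should reproduce the column asked for in the question (6, 9, 12, 15, 11, 6), as floats.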


pandas <= 1.0.X

Without built-in support, you can get your output by first reversing your data, applying a rolling sum with min_periods=1, and then reversing the result again.

b.a[::-1].rolling(3, min_periods=1).sum()[::-1]

0     6.0
1     9.0
2    12.0
3    15.0
4    11.0
5     6.0
Name: a, dtype: float64
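
The reversal trick works because a trailing window over the reversed data covers the same rows as a forward window over the original data. A small self-contained sketch, where `forward_sum` is a hypothetical helper name of mine and the loop is only there as a sanity check:

```python
import pandas as pd

def forward_sum(s, n):
    """Sum each value with the next n values (hypothetical helper)."""
    # Reverse, take a trailing window of n + 1 rows, then restore the order.
    return s[::-1].rolling(n + 1, min_periods=1).sum()[::-1]

a = pd.Series([1, 2, 3, 4, 5, 6], name="a")
r = forward_sum(a, 2)

# Naive check: slice the current row plus the next two and add them up.
expected = [float(a.iloc[i:i + 3].sum()) for i in range(len(a))]
print(r.tolist() == expected)
```

Because the second `[::-1]` also restores the original index order, the result can be assigned straight back to a DataFrame column.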
cs95