3

I would like to use the rolling method of Pandas. I need a slight adjustment, however, I would like the 'value' placed at the top of the 'window'.

Currently, I am using this:

self.df['new_col'] = self.df['Zone3'].rolling(4).sum()

Which is producing this:

       Date     Time     Period  Zone4  Zone3  new_col
0   2018-02-23  00:00     900     11      2      NaN
1   2018-02-23  00:15     900     11      1      NaN
2   2018-02-23  00:30     900      7      3      NaN
3   2018-02-23  00:45     900      2      0      6.0
4   2018-02-23  01:00     900      3      2      6.0
5   2018-02-23  01:15     900      7      0      5.0
6   2018-02-23  01:30     900      2      4      6.0

What I would actually like:

       Date     Time     Period  Zone4  Zone3  new_col
0   2018-02-23  00:00     900     11      2      6.0
1   2018-02-23  00:15     900     11      1      6.0
2   2018-02-23  00:30     900      7      3      5.0
3   2018-02-23  00:45     900      2      0      6.0
4   2018-02-23  01:00     900      3      2      NaN
5   2018-02-23  01:15     900      7      0      NaN
6   2018-02-23  01:30     900      2      4      NaN

Notice the sum value is at the start of the window (position 1), and not at the end (position 4)

If rolling is the wrong method, great, any method would help. I know how to do this in a "pythonic" way (using a for loop), I was just hoping to do it in place with the dataframe using pandas.

Thank you in advance

cs95
  • 379,657
  • 97
  • 704
  • 746
dudeguy
  • 57
  • 5

1 Answers1

3

Use shift:

self.df['new_col'] = self.df['Zone3'].rolling(4).sum().shift(-3)

Or more general:

N = 4
df['new_col'] = df['Zone3'].rolling(N).sum().shift(-N+1)
print (df)
         Date   Time  Period  Zone4  Zone3  new_col
0  2018-02-23  00:00     900     11      2      6.0
1  2018-02-23  00:15     900     11      1      6.0
2  2018-02-23  00:30     900      7      3      5.0
3  2018-02-23  00:45     900      2      0      6.0
4  2018-02-23  01:00     900      3      2      NaN
5  2018-02-23  01:15     900      7      0      NaN
6  2018-02-23  01:30     900      2      4      NaN

N = 2
df['new_col'] = df['Zone3'].rolling(N).sum().shift(-N+1)
print (df)
         Date   Time  Period  Zone4  Zone3  new_col
0  2018-02-23  00:00     900     11      2      3.0
1  2018-02-23  00:15     900     11      1      4.0
2  2018-02-23  00:30     900      7      3      3.0
3  2018-02-23  00:45     900      2      0      2.0
4  2018-02-23  01:00     900      3      2      2.0
5  2018-02-23  01:15     900      7      0      4.0
6  2018-02-23  01:30     900      2      4      NaN
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • This is great! Worked perfectly! Thank you. I thought about using shift, but I guess I didn't try hard enough. I though shift was for horizontal and not vertical. – dudeguy Feb 25 '18 at 17:25
  • I accepted the answer. I can't upvote it because I don't have enough reputation points – dudeguy Feb 25 '18 at 18:43