1

I want to calculate the centered rolling average in Pandas, where the center value is excluded.

The code below works perfectly, but it includes the center value:

df.Value.rolling(window=11, center=True, min_periods=1).mean()

The window=11 and min_periods=1 are important, since I want to calculate the average for every value in df.

For example,

  • for the second value, it should consider 5 values below and 1 that is above the current value. (Current code includes the second value)
  • for the last value, it should consider 5 values above, but exclude the last one. (Current code considers the last 6 values)

Here is a table with the expected results:

Values Expected
13313 27180.6
12792 28897
20254 28770.14286
34915 27468.5
31410 29037
36532 30028.5
36958 34071.1
35471 36600.66667
33018 38950.625
38080 38804
44074 39037.33333
54165 37520.2

And calculation method from Excel:

Example Table

Crowford99
  • 13
  • 4

1 Answers1

1

updated answer

You can modify my first approach of the initial answer to have a dynamic division:

win = 11
roll = df['Values'].rolling(window=win, center=True, min_periods=1)

df['out'] = roll.sum().sub(df['Values']).div(roll.count()-1)

Output:

    Values     Expected           out
0    13313  27180.60000  27180.600000
1    12792  28897.00000  28897.000000
2    20254  28770.14286  28770.142857
3    34915  27468.50000  27468.500000
4    31410  29037.00000  29037.000000
5    36532  30028.50000  30028.500000
6    36958  34071.10000  34071.100000
7    35471  36600.66667  36600.666667
8    33018  38950.62500  38950.625000
9    38080  38804.00000  38804.000000
10   44074  39037.33333  39037.333333
11   54165  37520.20000  37520.200000

initial answer

You have different ways, whether or not the operation is really a mean:

np.random.seed(0)
df = pd.DataFrame({'Value': np.random.randint(0, 10, 10)})

win = 3

df['mean'] = df['Value'].rolling(window=win, center=True, min_periods=1).mean()

df['mean_without_center'] = (df['Value'].rolling(window=win, center=True, min_periods=1)
                             .sum().sub(df['Value']).div(win-1)
                            )

half = win//2
mask = [False]*half+[True]+[False]*half

df['mean_without_center_apply'] = (df['Value'].rolling(window=win, center=True, min_periods=1)
                             .apply(lambda s: s.mask(mask[:len(s)]).mean())
                            )

Output:

   Value      mean  mean_without_center  mean_without_center_apply
0      5  2.500000                  0.0                        5.0
1      0  2.666667                  4.0                        4.0
2      3  2.000000                  1.5                        1.5
3      3  4.333333                  5.0                        5.0
4      7  6.333333                  6.0                        6.0
5      9  6.333333                  5.0                        5.0
6      3  5.666667                  7.0                        7.0
7      5  3.333333                  2.5                        2.5
8      2  3.666667                  4.5                        4.5
9      4  3.000000                  1.0                        2.0

Or with 's sliding_window_view:

from numpy.lib.stride_tricks import sliding_window_view as swv

win = 3
half = win//2

df['swv_mean'] = np.nanmean(swv(np.pad(df['Value'].astype(float), (win-1, 0),
                                       constant_values=np.nan),
                                win)
                            [:, np.r_[:half, half+1:win]],
                            axis=1
                           )

Output:

   Value  swv_mean
0      5       5.0
1      0       0.0
2      3       4.0
3      3       1.5
4      7       5.0
5      9       6.0
6      3       5.0
7      5       7.0
8      2       2.5
9      4       4.5
mozway
  • 194,879
  • 13
  • 39
  • 75
  • Thank you, but not exactly what I need. The first value should be the average of the next 5 values (if we go with win = 11). mean_without_center is close, but it works with win = 3, but not with win=11, since the div part is not dynamic. – Crowford99 May 13 '23 at 11:26
  • @Crowford99 please provide an explicit example. However if your average only takes into account the **next 5** for a window of 11, then it's rather a non centered shifted rolling of window 5... – mozway May 13 '23 at 11:58
  • You can find an example in the updated post. It considers the next 5 for the first value, but it considers the next 5 and the previous 5 for the tenth value. The way I constructed the rolling average code would be perfect if somehow I could exclude the current value. – Crowford99 May 13 '23 at 12:13
  • @Crowford99 can you provide your example in a reproducible format? I can't load data from an image – mozway May 13 '23 at 13:39
  • Added a table with examples. – Crowford99 May 13 '23 at 13:59
  • @Crowford99 thanks, I updated the answer to give your expected output – mozway May 13 '23 at 17:12