
In Excel, calculating a geomean with a window of size 2 on Col1 gives 6 in row 1 of Geo_2, since the geomean of 4 and 9 is 6. In pandas or NumPy it appears to be the reverse: with min_periods = 1, the first row reflects the calculation of just one value, and each subsequent calculation uses the previous and current rows of Col1. I want the calculation window to be the current and the next row of Col1, so that the first value of Geo_2 is 6 and the last value is 2.

    import numpy as np
    import pandas as pd
    from scipy.stats.mstats import gmean

    DASeries = [4, 9, 3, 3, 5, 7, 8, 4, 2]
    DA_df = pd.DataFrame(DASeries)
    geoMA2 = [2, 3]  # window sizes
    # One rolling geomean per window size; rolling() looks backward by default
    geo_df = pd.DataFrame([pd.Series(DASeries).rolling(window=elem, min_periods=1).apply(gmean, raw=True) for elem in geoMA2]).T
    Final = pd.concat([DA_df, geo_df], axis=1)
    labs = ['Col1', 'Geo_2', 'Geo_3']
    Final.columns = labs
    Final


cs95
rer49

1 Answer


Reverse the series with .iloc[::-1], apply the rolling geomean, then reverse the result back with .iloc[::-1]:

    pd.Series(DASeries).iloc[::-1].rolling(window=2, min_periods=1).apply(gmean).iloc[::-1]

    0    6.000000
    1    5.196152
    2    3.000000
    3    3.872983
    4    5.916080
    5    7.483315
    6    5.656854
    7    2.828427
    8    2.000000
    dtype: float64
BENY
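
As a possible alternative to reversing the series, a forward-looking window can be sketched with pandas' `FixedForwardWindowIndexer`. This is not part of the answer above. It assumes pandas 1.1 or later and strictly positive values, because it computes the geomean as exp(mean(log(x))) rather than calling `gmean`. The helper name `forward_geomean` is hypothetical.

```python
import numpy as np
import pandas as pd


def forward_geomean(values, window):
    """Geomean over the current row and the next `window - 1` rows.

    Hypothetical helper; assumes all values are strictly positive.
    """
    s = pd.Series(values, dtype="float64")
    # Forward-looking window: row i covers rows i .. i + window - 1,
    # truncated at the end of the series.
    indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=window)
    # exp(mean(log(x))) equals the geometric mean for positive x.
    return np.exp(np.log(s).rolling(window=indexer, min_periods=1).mean())


DASeries = [4, 9, 3, 3, 5, 7, 8, 4, 2]
Final = pd.DataFrame({"Col1": DASeries})
for w in (2, 3):
    Final[f"Geo_{w}"] = forward_geomean(DASeries, w)
print(Final)
```

With `min_periods=1`, the trailing rows fall back to whatever values remain, so the last entry of each Geo column is the last value of Col1 itself.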