There are multiple ways you can achieve this. I'll assume history_data_resampled
has a day frequency, and that you're using pandas
:
import pandas as pd
import numpy as np
# == Generating Random DataFrame ===============================
dates = pd.period_range('2021-01-01', '2022-12-31', freq='d')
random_values = np.random.randint(0, 120, dates.shape[0])
df = pd.DataFrame({'date': dates, 'value': random_values})
# == Moving average =============================================
# Generating a monthly average, with a period of 12 months, and 1 month of lag:
df.set_index('date').resample("1M")['value'].sum().rolling(12).mean()
# Outputs:
"""
date
2021-01 NaN
2021-02 NaN
2021-03 NaN
2021-04 NaN
2021-05 NaN
2021-06 NaN
2021-07 NaN
2021-08 NaN
2021-09 NaN
2021-10 NaN
2021-11 NaN
2021-12 1644.750000
2022-01 1648.416667
2022-02 1663.583333
2022-03 1696.166667
2022-04 1747.666667
2022-05 1766.750000
2022-06 1783.166667
2022-07 1825.083333
2022-08 1814.166667
2022-09 1792.250000
2022-10 1809.833333
2022-11 1789.166667
2022-12 1798.166667
Freq: M, Name: value, dtype: float64
"""
In the above example, we've resampled the data to a monthly frequency, summing all values from the same month, and then for each month, calculated the average of the previous 12 months. We're using .resample("1M")['value'].sum()
to sum all observations that fall in a specific month together.
Note that before '2021-11'
we couldn't calculate any value, since our data would require us to have values from '2020'
in our dataset.