Pandas resample aggregation, intra month periods?

Question

I'm trying to aggregate some data across periods using resampling and so far have something like this:

Data:

            val1  val2  val3  ...
date
2022-01-29  0.01  0.08  0.03
2022-01-30  0.04 -0.07  0.02
2022-01-31  0.09 -0.01  0.01
2022-02-01 -0.02  0.01  0.06
...

Code:

((df + 1).resample(frequency).transform(lambda x: x.cumprod()) - 1).resample(frequency).last()

This works exactly how I'd like for monthly or quarterly time series and the like but I'm trying to also get a rolling window from the start of the data set for example, 4 month periods from 2022/01/13: 2022/01/13-2022/05/12, 2022/05/13-2022/09/12, 2022/09/13-2023/01/12 etc I've tried using offsets or creating a custom DatetimeIndex but I either still just get period ends or an invalid frequency error. Is there any way to perform this type of aggregation from the data starting date?

SyntaxNavigator · Answer 1 · 2023-01-30T03:23:18.270

0

df = df.apply(pd.to_numeric, errors='coerce')
df = df.dropna()

start_date = df.index[0]
end_date = df.index[-1]
window_size = '4M'

df_shifted = df + 1
df_cumulative_product = df_shifted.resample(window_size).apply(lambda x: x.cumprod())
df_shifted_back = df_cumulative_product - 1
result = df_shifted_back.resample(window_size).last()

edited Jan 30 '23 at 03:23

answered Jan 30 '23 at 01:48

SyntaxNavigator

312
3
10

I'm afraid using a 30day in place of one month isn't accurate enough for this use case, I don't think calculating days per month could work either since each x month would have a different number of days. And I was also seeing a 'No numeric types to aggregate' error even though all columns except the index were float64. – user1267983 Jan 30 '23 at 02:46
Updated. Regarding the error, the code first converts all columns to numeric values and replaces non-numeric values with NaN. Then, it removes any rows that contain NaN values. – SyntaxNavigator Jan 30 '23 at 03:24
I think that's the same result I was seeing before unfortunately - resampling with any monthly frequency will use month ends for the periods. There are some other offset notations like month start of half-month but I've been unable to peg to the data start. – user1267983 Jan 30 '23 at 06:38

Pandas resample aggregation, intra month periods?

1 Answers1