I'm creating a payments_summary database where I have a client ID in each row, and a month in each column.

The source database has the columns [client_ID, date, amount] (one row for each payment received).

And I have another database where I have

Using pandas, I can easily group by calendar month with the resample function. I understand I could loop over the database so that each row is processed with a different date, but since my database is fairly large, I would rather use a vectorized approach.

The code below groups by calendar month, but I want to group by month using a different cut-off date.

I tried

df.groupby(['colA']).resample('M').sum()

So, if client A's cut-off date is the 4th of each month, I want a matrix where each column holds the sum of payments from the 4th of May to the 4th of June, and so on for each month, rather than from the first to the last day of the month.
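To make the desired result concrete, here is a minimal sketch of one vectorized way to get it. It is not a confirmed solution: the `cutoffs` table, its `cutoff_day` column, and the sample values are all hypothetical, and it assumes each client has a single cut-off day and that a period runs from the cut-off day up to the day before the next cut-off.

```python
import pandas as pd

# Hypothetical sample data in the [client_ID, date, amount] layout.
payments = pd.DataFrame({
    "client_ID": ["A", "A", "A", "B", "B"],
    "date": pd.to_datetime(
        ["2019-05-03", "2019-05-04", "2019-06-03", "2019-05-10", "2019-05-20"]
    ),
    "amount": [10.0, 20.0, 30.0, 5.0, 7.0],
})

# Hypothetical lookup table: one cut-off day of the month per client.
cutoffs = pd.DataFrame({"client_ID": ["A", "B"], "cutoff_day": [4, 15]})

# Attach each client's cut-off day to its payments.
df = payments.merge(cutoffs, on="client_ID", how="left")

# Calendar month of each payment, as a monthly Period.
month = df["date"].dt.to_period("M")

# Payments made before the cut-off day belong to the previous period.
before_cutoff = df["date"].dt.day < df["cutoff_day"]
df["period"] = month.where(~before_cutoff, month - 1)

# One row per client, one column per period, summed amounts.
summary = (
    df.groupby(["client_ID", "period"])["amount"]
    .sum()
    .unstack(fill_value=0)
)
print(summary)
```

In this sketch a column labelled 2019-05 for client A covers payments from the 4th of May through the 3rd of June, so the label is the month in which the period starts.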

Lumos
  • for offset date equal to 4, how about `df.groupby(df['colA'].sub(pd.to_timedelta(3, unit='D')).resample('M')).sum()`? – Quang Hoang Jul 08 '19 at 19:42
  • I couldn't make that code work, however, [this](https://stackoverflow.com/questions/24304019/pandas-timeseries-resampling-ending-a-given-day) thread addresses the same issue. – Leslie Brenes Jul 16 '19 at 18:01

0 Answers