0

I have a dataframe like this,

tidx = pd.date_range('2022-10-01', periods=10, freq='10D') 
data_frame = pd.DataFrame(1, columns=['inventory'], index=tidx)
print(data_frame)

Output:

               inventory
2022-10-01          1
2022-10-11          1
2022-10-21          1
2022-10-31          1
2022-11-10          1
2022-11-20          1
2022-11-30          1
2022-12-10          1
2022-12-20          1
2022-12-30          1

I want to find the sum from 23rd to the 23rd of each month. I couldn't find a way to pass the day number to resample method. Any help is really appreciated.

Michael M.
  • 10,486
  • 9
  • 18
  • 34
Phoenix
  • 359
  • 7
  • 15
  • What code have you tried with resample so far? By "23rd to 23rd" do you mean like october 23rd to november 23rd, november 23rd to december 23rd, etc? – rassar Oct 26 '22 at 02:02
  • yes, like you said, for example October 23rd to november 23rd, november 23rd to december 23rd, etc – Phoenix Oct 26 '22 at 02:21

1 Answers1

1

Is this what you need?

import pandas as pd
from datetime import timedelta

tidx = pd.date_range('2022-10-01', periods=10, freq='10D') 
data_frame = pd.DataFrame(1, columns=['inventory'], index=tidx)


data_frame.index.name = "date"
data_frame = data_frame.reset_index()
data_frame["fin_year_month"] = ""
data_frame.loc[data_frame["date"].dt.day < 23, ["fin_year_month"]] = (data_frame["date"] - timedelta(days=25)).dt.year.astype("str") + "_" + (data_frame["date"] - timedelta(days=25)).dt.month.astype("str")
data_frame.loc[data_frame["date"].dt.day >= 23, ["fin_year_month"]] = (data_frame["date"]).dt.year.astype("str") + "_" + (data_frame["date"]).dt.month.astype("str")
data_frame.groupby("fin_year_month").sum()

Just be careful with the number of days you subtract. For the 23 to 23 I subtract 25 and this is fine. For 30 or 31 it would be a harder problem. Number of days to subtract will depend on a particular month and would be easier to write a function that would give a "previous year-month" given a particular date

Nikolay Zakirov
  • 1,505
  • 8
  • 17
  • your code breaks with, `"None of [Index(['fin_year_month'], dtype='object')] are in the [columns]"`. Am I missing something? – Phoenix Oct 26 '22 at 11:49
  • 1
    Hmm, works for me. Must be different versions of Pandas. Ok, I made an update that might solve it for you. Please try again – Nikolay Zakirov Oct 26 '22 at 11:57