0

I was reading a lot of stuff about pandas and date time slicing but I haven't found a solution for my problem yet. I hope you could give me some good advices! I have a data frame with a Datetimeindex and for example a single column with floats. The time series is about 60 years. For example:

idx = pd.Series(pd.date_range("2016-11-1", freq="M", periods=48))    
dft = pd.DataFrame(np.random.randn(48,1),columns=["NW"], index=idx)

enter image description here

I want to aggregate the column "NW" as sum() per month. I have to solve two problems.

  1. The year begins in November and ends in October.
  2. I have two periods per 12 months to analyse: a) from November to End of April in the following year and b) from May to End of October in the same year For example: "2019-11-1":"2020-4-30" and "2020-05-01":"2020-10-31"

I think I could write a function but I wonder if there is an easier way with methods from pandas to solve this problems.

Do you have any tips? Best regards Tommi.

Here are some additional informations:
The real datas are daily observations. I want to show a scatter plot for a time series with only the sum() for every month from November-April along the timeline (60 years til now). And the same for the values from May to October.

TommiWe
  • 1
  • 2
  • A dataframe in the attached image is already monthly aggregated. Perhaps do you want to aggregate the column "NW" as sum() per six months in accordance with the two periods you defined? – quasi-human Jan 30 '22 at 12:30
  • will this work `dft.resample('6M').sum()`? – Irfanuddin Jan 30 '22 at 12:31
  • @qausi-human: Your are correct. My example shows already values per month. Now I have to sum these for 6 months within both periods I described above. – TommiWe Jan 31 '22 at 16:34
  • @Ifranuddin: This will work but it is not my problem I have to solve. I could also write for example: "dft.loc["2019-11-01":"2020-04-30"]. But I have to do this for every year. And the same operation for the other period from May to October. – TommiWe Jan 31 '22 at 16:35

1 Answers1

0

this is my solution so far. Not the shortest way I think, but it works fine.

d_precipitation_winter = {}
 #for each year without the current year
 for year in dft.index.year.unique()[:-1]:
 #Definition start and end date to mark winter months
 start_date = date(year,11,1)
 end_date = date(year+1,4,30)
 dft_WH = dft.loc[start_date:end_date,:].sum()
 d_precipitation_winter[year]=dft_WH
df_precipitation_winter = pd.DataFrame(data=d_precipitation_winter)
TommiWe
  • 1
  • 2