I have a daily time-series dataframe (called fname in my code, 14245 rows × 15 columns), shown in the image, and I need to count the number of days with a value greater than 0, for example the number of days when ext2d_1_dates is greater than 0, to get monthly climatology information (January-December). In detail, fname contains daily data from 1 January 1982 to 31 December 2020. First, I need to create a new vector of length 365 for each of the 11 columns I need out of the 15 (i.e., 365 rows × 11), where the first element contains the average for 1 January over 1982, 1983, 1984, ..., 2020, the second element the same for 2 January, and so on. After that, I apply .groupby('month') to get the monthly climatological value, i.e. the number of days when, for example, ext2d_1_dates is greater than 0 in each month from January to December over the whole data period (1 January 1982 to 31 December 2020). This will be used for a histogram plot.
For the 10-day running sum, I think a similar approach can be used, and I have tried it, but I don't know how to proceed correctly to get the 10-day running sum for January-December over 1982-2020.
So far, I've tried to achieve this with the following code:
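To make the goal concrete, here is a minimal sketch of the counting step on synthetic data. The random values and the name `demo` are made up for illustration and are not my real file; only the date range and the column name ext2d_1_dates come from my data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: one column, daily, 1982-2020.
dates = pd.date_range("1982-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {"ext2d_1_dates": rng.integers(0, 3, size=len(dates))}, index=dates
)

# Boolean mask of event days; summing True values counts them.
is_event = demo["ext2d_1_dates"] > 0

# Total number of event days per calendar month over all years (index 1..12).
monthly_counts = is_event.groupby(is_event.index.month).sum()

# Number of event days for each calendar day (month, day) across the years.
# This has 366 entries because 29 February appears in leap years.
daily_counts = is_event.groupby(
    [is_event.index.month, is_event.index.day]
).sum()
```

Summing a boolean mask instead of filtering the rows first keeps months or days with no events in the result as 0, which I think matters for keeping the axes the same length.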
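What I have in mind for the running sum is roughly the following sketch, again on synthetic data (the values and the names `demo`, `doy_counts` and `running_sum` are hypothetical). It assumes that grouping by day of year is acceptable, even though day-of-year numbers shift by one after 29 February in leap years.

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for one column of the real dataframe.
dates = pd.date_range("1982-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(1)
demo = pd.Series(
    rng.integers(0, 3, size=len(dates)), index=dates, name="ext2d_1_dates"
)

is_event = demo > 0

# Count event days per day of year (1..366) over all years.
# reindex guarantees every day of year is present, with 0 where nothing occurred.
doy_counts = (
    is_event.groupby(is_event.index.dayofyear)
    .sum()
    .reindex(range(1, 367), fill_value=0)
)

# 10-day running sum; min_periods=1 is an assumption that avoids NaN
# for the first nine days by summing whatever days are available.
running_sum = doy_counts.rolling(window=10, min_periods=1).sum()
```

With the default `min_periods`, the first nine values would be NaN instead, which is what `rolling(10).sum()` gives in my code below.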
import pandas as pd
import matplotlib.pyplot as plt

csv_path_cont = 'path_to_my_file/file.csv'
fname = pd.read_csv(csv_path_cont)
fname['date'] = pd.to_datetime(fname['time'])
# set_index returns a new dataframe; drop=False keeps 'date' as a column too
fname = fname.set_index('date', drop=False)
fname['month'] = fname.index.month
fname['dayofyear'] = fname.index.dayofyear
# skip 'time' and the three helper columns added above
for j in fname.columns[1:-3]:
    # monthly number of days with a value greater than 0
    event_occurrence = fname.loc[fname[j] > 0, [j, 'month']]
    num_event = event_occurrence.groupby('month').count()
    plt.figure()
    plt.bar(num_event.index, num_event[j])
    # number of event days per day of year, then a 10-day running sum
    event_occurrence = fname.loc[fname[j] > 0, [j, 'dayofyear']]
    clim_event_occurrence = event_occurrence.groupby(event_occurrence.index.dayofyear).count()
    dekad_clim_event_occurrence = clim_event_occurrence.rolling(10).sum()
    plt.plot(dekad_clim_event_occurrence[j].values)
    plt.title(j)
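The plots I'm getting are not correct, and I suspect it's because the two x-axes do not have the same length: the bars are plotted against months (1-12) while the running sum is plotted against days of the year.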
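One idea I am considering, shown here only as a sketch, is to map each day of the year to a fractional position on the month axis so that both series could share one x-axis. The reference year 2001 and the name `x_month` are arbitrary choices of mine, and 29 February is ignored.

```python
import pandas as pd

# Reference non-leap year, used only to map day of year -> (month, day).
ref = pd.date_range("2001-01-01", "2001-12-31", freq="D")

# Place each day inside its month's slot on a 1..12 axis:
# month m spans [m - 0.5, m + 0.5), so a bar centred on m lines up with its days.
x_values = (ref.month - 0.5 + (ref.day - 0.5) / ref.days_in_month).to_numpy()

# Indexed by day of year (1..365) for easy lookup.
x_month = pd.Series(x_values, index=ref.dayofyear, name="x_month")
```

The running sum for days 1 to 365 could then be plotted against `x_month.values` on the same axes as the monthly bars, though I am not sure this is the right way to do it.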
I want my results and plots to look like this.
In the expected plot, the histogram represents the number of days when, for example, ext2d_1_dates is greater than 0, while the grey area represents the 10-day running sum.
NB: I'm using Python 3.