I have a daily time-series dataframe (named fname in my code, 14245 rows × 15 columns), shown in the image data_description. It contains daily data from 1st Jan 1982 to 31st Dec 2020. I need to count the number of days on which a value is greater than 0, for example the number of days when ext2d_1_dates is greater than 0, to get monthly climatology information (January-December).

In detail: first, I need to create a new vector of length 365 for each of the 11 columns I need out of the 15 (i.e., 365 rows × 11 columns), where the first element holds the value for January 1st aggregated over 1982, 1983, 1984, ..., 2020, the second element holds the same for January 2nd, and so on. Then I apply .groupby('month') to get the monthly climatological value, i.e. the number of days when, for example, ext2d_1_dates is greater than 0 in each month from January to December over the whole data period (1st of January 1982 to 31st of December 2020). This will be used for a histogram plot.
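To make the counting step concrete, here is a minimal sketch on synthetic data. The frame `demo` and its random 0/1 values are stand-ins I made up, not my real data; only the column name ext2d_1_dates and the date range come from my dataframe. The idea is to count on a boolean mask instead of filtering rows first, so that every month and every calendar day is present in the result even when it has no events.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for fname: daily data from 1982-01-01 to 2020-12-31.
# The random 0/1 values are an assumption for illustration only.
dates = pd.date_range("1982-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {"ext2d_1_dates": rng.integers(0, 2, size=len(dates))},
    index=dates,
)

# Boolean mask of event days; summing booleans counts the True values.
is_event = demo["ext2d_1_dates"] > 0

# Number of event days per calendar month, pooled over all years.
monthly_counts = is_event.groupby(is_event.index.month).sum()

# Number of event days per calendar day (month, day), pooled over all years.
# This has 366 entries because 29 February occurs in leap years.
daily_counts = is_event.groupby(
    [is_event.index.month, is_event.index.day]
).sum()

print(monthly_counts)
```

Grouping by (month, day) rather than by `dayofyear` keeps 1 March on the same row in leap and non-leap years; 29 February can be dropped afterwards if exactly 365 rows are wanted.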

For the 10-day running sum, I think a similar approach applies, and I have tried it, but I don't know how to proceed correctly to get the 10-day running sum for January-December over 1982-2020.
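What I have in mind for the running sum is roughly the following sketch, again on synthetic data (the series `values` and its random contents are hypothetical). It assumes that 29 February is dropped so that the climatology has exactly 365 entries, and that a trailing 10-day window is what is wanted, so the first 9 entries are NaN.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one column of fname (values are made up).
dates = pd.date_range("1982-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(1)
values = pd.Series(
    rng.integers(0, 2, size=len(dates)), index=dates, name="ext2d_1_dates"
)

# Assumption: drop 29 February so every year contributes 365 days.
is_feb29 = (values.index.month == 2) & (values.index.day == 29)
no_leap = values[~is_feb29]

# Count event days per calendar day, pooled over all years.
# "MM-DD" labels sort chronologically from "01-01" to "12-31".
is_event = no_leap > 0
key = no_leap.index.strftime("%m-%d")
clim = is_event.groupby(key).sum()

# Trailing 10-day running sum over the 365-day climatology.
running = clim.rolling(10).sum()

print(running.tail())
```

Because the counts come from a boolean mask rather than from filtered rows, `clim` always has 365 entries, so the running sum and a daily bar plot share the same x-axis length.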

So far, I've tried to achieve this with the following code:

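    import pandas as pd
    import matplotlib.pyplot as plt

    csv_path_cont = 'path_to_my_file/file.csv'
    fname = pd.read_csv(csv_path_cont)

    # Parse the time column and use it as a DatetimeIndex.
    # set_index returns a new frame, so the result must be assigned back;
    # drop=False keeps 'date' as a column so the column slice below is unchanged.
    fname['date'] = pd.to_datetime(fname['time'])
    fname['month'] = fname['date'].dt.month
    fname = fname.set_index('date', drop=False)
    fname['dayofyear'] = fname.index.dayofyear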

    for j in fname.columns[1:-3]:
        event_occurrence = fname[[j,'month']][fname[j] > 0]
        num_event = event_occurrence.groupby('month').count()
        plt.figure()
        plt.bar(num_event.index,num_event[j])
    
        event_occurrence = fname[[j,'dayofyear']][fname[j]>0]
        clim_event_occurrence = event_occurrence.groupby(event_occurrence.index.dayofyear).count()
        dekad_clim_event_occurrence = clim_event_occurrence.rolling(10).sum()
        plt.plot(dekad_clim_event_occurrence[j].values)
        plt.title(j)

The plots I am getting are not correct, and I suspect it's because the x-axes are not the same length (see sample_of_my_results_and_plot).

I want my results and plots to look like this: expected_result_and_plot.

In the expected plot, the histogram represents the number of days when, for example, ext2d_1_dates is greater than 0, while the grey area represents the 10-day running sum.

NB: I'm using Python 3.
