Loop over netCDF datetime format and calculate mean based on month

Question

I have a dataset (a netCDF4 input_file) with the dimensions (504, 720, 500) where the first is a datetime value:

0     1979-01-15
1     1979-02-15
2     1979-03-15
3     1979-04-15
4     1979-05-15
         ...    
499   2020-08-15
500   2020-09-15
501   2020-10-15
502   2020-11-15
503   2020-12-15
Length: 504, dtype: datetime64[ns]

There is a variable with values I want to average per month. So ultimately I would like 12 values with the average of the variable based on the month in the first dimension.

I tried looping over it like this:

# empty dataframe
df = pd.DataFrame(columns = ['Month', 'Value'])

for i in range(size(df['time'])):
    month = input_file['time'][i].month # get the current month
    avg = np.average(input_file['values'][i, :, :]) # average for the month of that year

    # append to df
    df = df.append(pd.DataFrame({'Month' : month,
                                 'Value' : avg})

At this point I am a bit lost: this doesn't work (invalid syntax), and I would still need to loop over the values again to get the average for each month separately.

score 1 · Accepted Answer · answered Jan 17 '22 at 17:08

Assuming the 2nd and 3rd dimensions are named lat and lon, and that input_file is an xarray Dataset, it seems what you are trying to do is just:

input_file.mean(dim = ['lat', 'lon'])

Then you can convert the result to a dataframe with .to_dataframe().
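
A minimal sketch of how this could be extended to the 12 monthly averages the question asks for, assuming the data is held in an xarray Dataset (for example opened with xr.open_dataset) and the dimensions are named time, lat and lon. The synthetic dataset and the variable name var are hypothetical stand-ins for the real file, and the spatial mean here is unweighted (no latitude area weighting):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the real file: 504 mid-month time stamps
# (1979-01-15 ... 2020-12-15) on a small lat/lon grid.
time = pd.date_range("1979-01-01", periods=504, freq="MS") + pd.Timedelta(days=14)
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"var": (("time", "lat", "lon"), rng.random((504, 4, 5)))},
    coords={
        "time": time,
        "lat": np.linspace(-10.0, 10.0, 4),
        "lon": np.linspace(0.0, 20.0, 5),
    },
)

# Step 1: average over space, leaving one value per time step.
spatial_mean = ds.mean(dim=["lat", "lon"])

# Step 2: group by calendar month and average across all years.
monthly_mean = spatial_mean.groupby("time.month").mean()

# Step 3: convert to a dataframe indexed by month (1..12).
df = monthly_mean.to_dataframe()
print(df)
```

The resulting dataframe has one row per calendar month, so no explicit loop over the time axis is needed.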

answered Jan 17 '22 at 17:08

Thrasy

This doesn't work sadly, "netcdf: attribute not found". – B.Quaink Jan 18 '22 at 13:44
Without access to the data, I am afraid I can't help. – Thrasy Jan 18 '22 at 20:38

score 0 · Answer 2 · edited Mar 08 '22 at 18:48

I'm not sure if this is what you need:

import xarray as xr
ds = xr.open_dataset('file.nc')
monthly = ds.resample(time='M').mean()  # resample is a method of the dataset, not of the xr module
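
Because the data in the question is already monthly, resampling to a monthly frequency keeps one value per year and month rather than collapsing across years. A small sketch on synthetic data (the array and all names are hypothetical) contrasts that with grouping on the calendar month, which yields 12 values. The 'MS' (month start) frequency alias is used here as an assumption, to sidestep alias differences between pandas versions:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical monthly series: 504 mid-month time stamps, values 0..503.
time = pd.date_range("1979-01-01", periods=504, freq="MS") + pd.Timedelta(days=14)
da = xr.DataArray(np.arange(504.0), coords={"time": time}, dims="time")

# Resampling to monthly bins: still one value per (year, month) pair.
per_year_month = da.resample(time="MS").mean()

# Grouping on the calendar month: one value per month across all years.
by_calendar_month = da.groupby("time.month").mean()

print(per_year_month.sizes["time"])      # 504
print(by_calendar_month.sizes["month"])  # 12
```

Under these assumptions, groupby on "time.month" is the operation that matches the 12 values asked for; resample is more useful when the input has a finer time step than monthly.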

edited Mar 08 '22 at 18:48

aaossa

answered Mar 03 '22 at 17:25

ERIKA PARDO
