Calculating 30 year climate normal from gridded dataset in Python

Question

I am trying to calculate the 30 year temperature normal (1981-2010 average) for the NARR daily gridded data set linked below.

In the end for each grid point I want an array that contains 365 values, each of which contains the average temperature of that day calculated from the 30 years of data for that day. For example the first value in each grid point's array would be the average Jan 1 temperature calculated from the 30 years (1981-2010) of Jan 1 temperature data for that grid point. My end goal is to be able to use this new 30yrNormal array to calculate daily temperature anomalies from.

So far I have only been able to calculate anomalies from one year worth of data. The problem with this is that it is taking the difference between the daily temperature and the average for the whole year rather then the difference between the daily temperature and the 30 year average of that daily temperature:

file='air.sfc.2018.nc'
ncin = Dataset(file,'r')
#put data into numpy arrays
lons=ncin.variables['lon'][:]
lats=ncin.variables['lat'][:]
lats1=ncin.variables['lat'][:,0]
temp=ncin.variables['air'][:]
ncin.close()

AvgT=np.mean(temp[:,:,:],axis=0)
#compute anomalies by removing time-mean
T_anom=temp-AvgT

Data: ftp://ftp.cdc.noaa.gov/Datasets/NARR/Dailies/monolevel/ For the years 1981-2010

Try reading multiple one-year files in a dict with the year as a key and the corresponding `Dataset` as value. Then try to see whether you can figure out how to average. If you still need help, you need to tell more about the structure of `Dataset`, for example, how do you look up the temperature for a given day at a given lat/lon coordinate. — Han-Kwang Nienhuys, Jun 22 '20 at 16:08
The best way to analyze climate data is by using special operators like **CDO**, **NCO**, **NCL**, etc. I would encourage you to explore them before proceeding to python, but you could try exploring the `resample` function in `Xarray`. Also, try the `ydaymean` function in CDO to see if it answers your question. It basically calculates the multi-year daily mean. — Dani56, Jun 22 '20 at 18:06

Robert Wilson · Accepted Answer · 2022-02-10T14:47:01.223

2

This is most easily solved using CDO.

You can use my package, nctoolkit (https://nctoolkit.readthedocs.io/en/latest/ & https://pypi.org/project/nctoolkit/) if you are working with Python on Linux. This uses CDO as a backend.

Assuming that the 30 files are a list called ff_list. The code below should work.

First you would create the 30 year daily mean climatology.

import nctoolkit as nc
mean_30 = nc.open_data(ff_list)
mean_30.merge_time()
mean_30.drop(month=2,day=29)
mean_30.tmean("day")
mean_30.run()

Then you would subtract this from the daily figures to get the anomalies.

anom_30 = nc.open_data(ff_list)
anom_30.cdo_command("del29feb")
anom_30.subtract(mean_30)
anom_30.run()

This should have the anomalies

One issue is whether the files have leap years or how you want to handle leap years if they exists. CDO has an undocumented command -delfeb29, which I have used above

edited Feb 10 '22 at 14:47

answered Jun 23 '20 at 14:31

Robert Wilson

3,192
11
19

Thanks for your help! I am having trouble installing nctoolkit. When I try conda install nctoolkit I get the folllowing error message: PackagesNotFoundError: The following packages are not available from current channels: - nctoolkit Current channels: - https://repo.anaconda.com/pkgs/main/osx-64 - https://repo.anaconda.com/pkgs/main/noarch - https://repo.anaconda.com/pkgs/r/osx-64 - https://repo.anaconda.com/pkgs/r/noarch Is there another way I should try to install the package? – megamart27 Jul 13 '20 at 20:43
Conda installation is as follows: `conda install -c rwi nctoolkit` – Robert Wilson Jul 14 '20 at 07:55

Calculating 30 year climate normal from gridded dataset in Python

1 Answers1