
CDO has worked fine for me until I ran into the following. I have a netCDF file of daily data from year 2101 to 2228, and I want to obtain a file containing only the years 2101 to 2227, so I run:

cdo selyear,2101/2227 in.nc out.nc

But the output file is more than four times the size of the input on disk! It seems to have the right number of time steps, and the start and end dates are correct. Latitude and longitude also appear to be the same as in the input, so I wonder why the file size increased.

    Perhaps the initial file had compression applied to it? cdo may have inflated the file in order to perform your operation. See for example https://www.unidata.ucar.edu/blogs/developer/entry/netcdf_compression – Robert Davy Feb 28 '19 at 02:33
  • In addition to Robert Davy's comment: first check whether `ncdump -k FILENAME` gives the same output for both files (it reports the netCDF format type). Then look at the `_DeflateLevel` and `_Shuffle` attributes of the variables via `ncdump -s -h FILENAME`. A deflate level of 0 (or no such attribute at all) means no compression. If you run the command without `-s`, these attributes are not printed. – daniel.heydebreck May 14 '19 at 09:39
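
As a quick sketch of those checks on the files from the question (the grep filter is just a convenience for picking out the relevant attributes):

ncdump -k in.nc     # netCDF format type of the input
ncdump -k out.nc    # ... and of the output

# -s is needed for _DeflateLevel and _Shuffle to be printed at all
ncdump -s -h in.nc | grep -E '_DeflateLevel|_Shuffle'
ncdump -s -h out.nc | grep -E '_DeflateLevel|_Shuffle'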

1 Answer


Perhaps try to retain the compression by telling cdo to output netCDF-4 with zip compression:

cdo -f nc4c -z zip_9 selyear,2101/2227 in.nc out.nc

This is the maximum compression level; I usually use zip_5 or so, as the files end up not much larger than with zip_9 and the command runs much faster.
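
To see how much this helps, a simple size comparison of input and output:

ls -lh in.nc out.nc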

An alternative is to (re)pack the data to shorts (16-bit integers) with add_offset and scale_factor, like this:

cdo pack -selyear,2101/2227 in.nc out.nc
– ClimateUnboxed