I am trying to extract a single variable (DUEXTTAU) from multiple NC files, and then combine all the individual files into a single NC file. I am using nco, but have an issue with ncks.

The NC filenames follow:

MERRA2_100.tavgM_2d_aer_Nx.YYYYMM.nc4

Each file has one (monthly) time step. The value of the time coordinate carries no real information; what changes from file to file is the units attribute and begin_date. For example, the file MERRA2_100.tavgM_2d_aer_Nx.198001.nc4 has:

    int time(time=1);
      :long_name = "time";
      :units = "minutes since 1980-01-01 00:30:00";
      :time_increment = 60000; // int
      :begin_date = 19800101; // int
      :begin_time = 3000; // int
      :vmax = 9.9999999E14f; // float
      :vmin = -9.9999999E14f; // float
      :valid_range = -9.9999999E14f, 9.9999999E14f; // float
      :_ChunkSizes = 1U; // uint

I repeat this step for each file:

ncks -v DUEXTTAU MERRA2_100.tavgM_2d_aer_Nx.YYYYMM.nc4 YYYYMM.nc4

and then

ncrcat YYYYMM.nc4 final.nc4

In final.nc4, every time step has the same time value, that of the first YYYYMM.nc4. For example, after combining the three files for 198001, 198002 and 198003, the time coordinate equals 198001 for all the time steps. How should I deal with this?

ClimateUnboxed
peteron30

2 Answers

Firstly, this command should work:

ncrcat -v DUEXTTAU MERRA2_100.tavgM_2d_aer_Nx.??????.nc4 final.nc4

However, recent versions of NCO fail to correctly reconstruct or re-base the time coordinate when time is an integer, which it is in your case. The fix is in the latest NCO snapshot on GitHub and will be in 4.9.3 to be released hopefully this week. If installing from source is not an option, then manual intervention would be required (e.g., change time to floating point in each input file with ncap2 -s 'time=float(time)' in.nc out.nc). In any case, the time_increment, begin_date, and begin_time attributes are non-standard and will simply be copied from the first file. But time itself should be correctly reconstructed if you use a non-broken version of ncrcat.
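
The manual workaround described above can be sketched as a small shell script. This is only a rough sketch under stated assumptions: the helper names `run` and `float_time_concat`, the `DRY_RUN` variable, and the `_ft.nc4` suffix for temporary files are made up for illustration. Only the `ncap2 -s 'time=float(time)'` conversion and `ncrcat -v DUEXTTAU` come from the answer itself.

```shell
#!/bin/sh
# Sketch: convert the integer time coordinate to floating point in each
# file, then concatenate. Helper names, DRY_RUN and the _ft.nc4 suffix
# are illustrative assumptions, not part of NCO.

# Run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: float_time_concat OUTPUT INPUT...
float_time_concat() {
    out=$1
    shift
    n_inputs=$#
    for f in "$@"; do
        tmp="${f%.nc4}_ft.nc4"
        # Promote time to float, as suggested in the answer (-O overwrites).
        run ncap2 -O -s 'time=float(time)' "$f" "$tmp" || return 1
        # Append the temporary file to the argument list.
        set -- "$@" "$tmp"
    done
    # Drop the original inputs so that "$@" holds only the temporary files.
    shift "$n_inputs"
    # Concatenate along the record (time) dimension, keeping only DUEXTTAU.
    run ncrcat -O -v DUEXTTAU "$@" "$out" || return 1
    run rm -f "$@"
}

# Example call (commented out so that sourcing this file has no side effects):
# float_time_concat final.nc4 MERRA2_100.tavgM_2d_aer_Nx.??????.nc4
```

Setting `DRY_RUN=1` before the call prints the commands instead of running them, which makes it easy to check the file names first. For long records, `double(time)` may be a safer choice than `float(time)`, since single precision represents integers exactly only up to 2^24, about 16.7 million minutes or roughly 32 years.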

Charlie Zender
  • Thanks @charlie-zender. The command worked, but the issue still exists for the time coordinate. I may have misunderstood the problem, but when I opened final.nc4 with `xarray` and printed the time coordinate, I saw this: array(['1980-01-01T00:30:00.000000000', '1980-01-01T00:30:00.000000000', '1980-01-01T00:30:00.000000000', '1980-01-01T00:30:00.000000000',... – peteron30 May 19 '20 at 13:04
  • Is the information in the units metadata set correctly though? What do you get if you try cdo showtime and cdo showdate (or alternatively cdo info) on the original files for 198001, 198002 and 198003? If this is correct, I think cdo can handle the differing "base" times, but I thought that was also the case for nco. – ClimateUnboxed May 20 '20 at 09:12
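
As a sanity check on the output quoted in the comments above, a correctly re-based time coordinate should hold increasing offsets from the first file's reference time rather than one repeated value. The sketch below computes those expected offsets in minutes, assuming each file's single time step sits at its own reference time (the first of the month at 00:30:00, as in the units string shown in the question). The function name `minutes_between` is made up for illustration, and the `-d` option requires GNU date.

```shell
#!/bin/sh
# Minutes from BASE to TARGET, both given as "YYYY-MM-DD HH:MM:SS" in UTC.
# Relies on GNU date for the -d option.
minutes_between() {
    base_s=$(date -u -d "$1" +%s) || return 1
    target_s=$(date -u -d "$2" +%s) || return 1
    echo $(( (target_s - base_s) / 60 ))
}

# Expected time values for 198001, 198002 and 198003, expressed in
# "minutes since 1980-01-01 00:30:00" (the units of the first file).
for month in 01 02 03; do
    minutes_between "1980-01-01 00:30:00" "1980-${month}-01 00:30:00"
done
```

Under these assumptions the three values would be 0, 44640 and 86400 (1980 is a leap year), so a time coordinate that decodes to 1980-01-01T00:30:00 at every step has not been re-based.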

You can do this using cdo as well, but you need two steps:

cdo mergetime MERRA2_100.tavgM_2d_aer_Nx.??????.nc4 merged_file.nc 
cdo selvar,DUEXTTAU merged_file.nc DUEXTTAU.nc 

This should work if the begin dates are all set correctly. The problem is that merged_file.nc could be massive, so it may be better to loop through the files, extract the variable first, and then combine:

for file in MERRA2_100.tavgM_2d_aer_Nx.??????.nc4; do
    # strip the trailing .nc4 so the output name matches the mergetime glob below
    cdo selvar,DUEXTTAU "$file" "${file%.nc4}_duexttau.nc4"
done
cdo mergetime MERRA2_100.tavgM_2d_aer_Nx.??????_duexttau.nc4 DUEXTTAU.nc
rm -f MERRA2_100.tavgM_2d_aer_Nx.??????_duexttau.nc4 # clean up
ClimateUnboxed