
I am analyzing a time series of data that is split by time into two NetCDF files (infiles). These files have a different number of variables/fields, by design. Traditionally I have been using Climate Data Operators (CDO) to easily merge two datasets sorted by time using the following command in a terminal:

cdo mergetime <infiles> <outfile>

This command merges any number of input files ("infiles") sorted by time and writes a single "outfile" containing a time series of all the data in each; however, that doesn't work by default here, as cdo aborts with the following error:

cdo    select (Abort): Input streams have different number of variables per timestep!

The statement is true: each file does have a different number of variables per timestep, but that prevents me from looking at the dataset as a whole. I have also tried the following modifications to the merge command, without success:

cdo mergetime -select,name=<variable> <infiles> <outfile>
cdo -select,name=<variable> <infiles> <outfile>

I have read through the CDO User Guide and have not found an alternative solution yet. I would be very grateful if anyone could offer a workaround for joining the two files into a single time series (preferably in CDO, but not necessarily), as I am running out of ideas.

Brandon
  • I don't think this is possible with the standard NetCDF format; the grid definition is common to all variables... I don't think you can have N variables for some timesteps and M variables for others... The only way would be to try and set the fields to missing in the time slices where they are not available - is that really going to be helpful for you? Sounds to me like the solution you already have of working with one file per variable is the best... – ClimateUnboxed Apr 04 '22 at 22:08
  • Copying new fields onto the prior dataset and assigning values as missing would actually be just as helpful. The only variables that require me to read off the combined time series from both datasets are common to both files, but because one file has new variables I can't merge them with my known conventions like cdo. Do you know how to do this? – Brandon Apr 04 '22 at 22:27
  • Why don't you use delete to remove the new annoying variable? On phone now but will try to post an answer to this effect tomorrow. – ClimateUnboxed Apr 04 '22 at 22:30
  • The new variables are there by design, as previously stated, and they are necessary for my data analysis. The option to copy the new variables to the file that doesn't have them and assign their values as missing (or anything really, it doesn't matter) does seem ideal if it would let me merge the files using the cdo mergetime convention. – Brandon Apr 04 '22 at 22:37
  • Ah ignore below then. I'll think about it in the morning – ClimateUnboxed Apr 04 '22 at 22:38
  • "copying new fields onto the prior dataset and assigning values as missing would actually be just as helpful". This is probably your only option. Merging operations are relatively strict in CDO to reduce to probability of accidental mistakes, so the likes of merge, mergtime and cat will not be able to handle your situation – Robert Wilson Apr 05 '22 at 07:17
  • Is this a function that is available through cdo or would I have to use a different tool? I don't see this in the documentation either – Brandon Apr 05 '22 at 21:31
  • I would create a new field from an existing one and then set everything to missing, doing something like this: `cdo -setrtomiss,0,-1000 -aexpr,"new=old-400"` – Robert Wilson Apr 06 '22 at 08:19
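
Building on Robert Wilson's suggestion above, a minimal sketch of that workaround might look like the following. The variable names ("extra", present only in part2.nc, and "tas", common to both files), the file names, and the sentinel value -9e33 are placeholders; adjust them to your data and check the operator syntax against your CDO version.

# 1. Add "extra" to the file that lacks it, filled with a sentinel value
cdo aexpr,"extra=tas*0.0-9e33" part1.nc part1_tmp.nc

# 2. Flag the sentinel as missing; only "extra" holds -9e33, so the other
#    variables are untouched as long as they never take exactly that value
cdo setctomiss,-9e33 part1_tmp.nc part1_padded.nc

# 3. Both files now carry the same set of variables, so mergetime works as usual
cdo mergetime part1_padded.nc part2.nc merged.nc

If several variables are missing from the older file, the aexpr step can be repeated for each one (chaining multiple statements separated by semicolons in a single expression should also work).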

1 Answer


On my phone, but you could delete the extra annoying new variables from the files with NCO like this:

ncks -x -v var1,var2 in.nc out.nc

And then merge as usual. I think you can use the cdo delete operator to do the same thing.
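
For reference, the CDO counterpart would presumably be the delete operator; the exact parameter syntax below is an assumption, so check it against your CDO version:

cdo delete,name=var1,var2 in.nc out.nc

As noted in the comments, though, dropping the extra variables only helps if they are not needed in the analysis; if they are, the missing-value padding sketched above is the way to keep them while still being able to mergetime.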

ClimateUnboxed