
I have a netCDF file with an hourly timestep and some missing values. For each grid cell, I want to set all hourly steps of a day to missing if any hourly step of that day is missing, preferably using cdo or nco.

I came up with the following cdo-based solution. It builds a mask with the gec comparison and then uses the dayavg command to get a daily mask that is 1 if all timesteps are okay, or missing if any of them is missing. I can then use mul to wipe all hourly slices of a day in which any value is missing, like this:

# low threshold in gec to ensure output is 1 for any data value
cdo dayavg -gec,-1e32 in.nc mask.nc # use avg NOT mean
cdo mul in.nc mask.nc out.nc 

This works, but:

  • It only works for a single day of data (there is not yet a daymul function in cdo, a pity), so you need to extract the data one day at a time in a loop, which is extremely slow and clunky.
  • It feels like a big fudge to use the gec logical with some very low threshold.

So I am wondering whether I'm missing a neater solution in cdo, or whether there is a more concise way of doing this in nco?

EDIT 01/2021: I thought there was a daymul function in cdo so at least I could use the above method on multi-day timeseries, but surprisingly there wasn't. However, Uwe very kindly responded to my request to add daymul and daydiv functions into cdo, which will be available from version v2.0.3 in January 2021.
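With daymul available, the two-step method above might extend to a multi-day file roughly as follows. This is only a sketch, assuming a cdo version that includes daymul (v2.0.3 or later, as noted above); the function name mask_incomplete_days and the temporary-file handling are my own illustration, not part of cdo.

```shell
#!/bin/sh
# Sketch: blank out every hourly step of any day that has a missing hour.
# Assumes a cdo build that provides daymul; mask_incomplete_days is a
# hypothetical helper name used only for this example.
mask_incomplete_days() {
    infile=$1
    outfile=$2
    tmpdir=$(mktemp -d) || return 1
    mask="$tmpdir/mask.nc"
    status=0

    # Daily mask: 1 where all hours of the day are present, missing otherwise.
    # dayavg (NOT daymean) returns missing if any input step is missing.
    cdo dayavg -gec,-1e32 "$infile" "$mask" || status=$?

    # Multiply each hourly step by the mask of its own calendar day.
    if [ "$status" -eq 0 ]; then
        cdo daymul "$infile" "$mask" "$outfile" || status=$?
    fi

    rm -rf "$tmpdir"
    return "$status"
}
```

It would be called as mask_incomplete_days in.nc out.nc, which avoids the day-by-day extraction loop entirely.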

ClimateUnboxed

1 Answer

My guess is that this is as concise as you will get in CDO.

However, I find a less "fudgy" way of doing this is using the eq operator. This is generally a neater (and I think faster) way to create a mask file where 1 is non-NA and NA is missing. So your second line becomes:

cdo dayavg -eq in.nc in.nc mask.nc
Robert Wilson
  • Thanks Robert. Can I ask, is this function safe with a floating-point field? I always worry about using eq on floats in case it fails due to some kind of rounding issue. – ClimateUnboxed Dec 21 '21 at 10:46
  • It's never caused me problems, but potentially it's not 100% robust with all data types. – Robert Wilson Dec 21 '21 at 11:59
  • I suppose comparing a field with itself should be the safest usage of eq on floats; if that weren't bit-reproducible it would be concerning... – ClimateUnboxed Dec 21 '21 at 17:06