I'm trying to create a CF-compliant netCDF file. I can get it about 98% CF-compliant with xarray, but I am running into one issue. When I run ncdump on the file I am creating, I see the following:

float lon(lon) ;
    lon:_FillValue = NaNf ;
    lon:long_name = "Longitude" ;
    lon:standard_name = "longitude" ;
    lon:short_name = "lon" ;
    lon:units = "degrees_east" ;
    lon:axis = "X" ;
    lon:valid_min = -180.f ;
    lon:valid_max = 180.f ;
float lat(lat) ;
    lat:_FillValue = NaNf ;
    lat:long_name = "Latitude" ;
    lat:standard_name = "latitude" ;
    lat:short_name = "lat" ;
    lat:units = "degrees_north" ;
    lat:axis = "Y" ;
    lat:valid_min = -90.f ;
    lat:valid_max = 90.f ;
double time(time) ;
    time:_FillValue = NaN ;
    time:standard_name = "time" ;
    time:units = "days since 2006-01-01" ;
    time:calendar = "gregorian" ;

The coordinates of my dataset are lat, lon, and time. When I convert to netCDF via ds.to_netcdf(), all coordinate variables have fill values applied automatically because they are floats. Having a fill value on a coordinate variable violates the CF conventions (http://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#attribute-appendix).

I tried to change the encoding so these specific variables are not compressed:

import numpy as np
import xarray as xr
import pandas as pd
import datetime as dt

# Small synthetic grid: 10 longitudes x 8 latitudes, one time step
lons = np.arange(-75, -70, .5).astype(np.float32)
lats = np.arange(40, 42, .25).astype(np.float32)
u = np.random.randn(1, 8, 10).astype(np.float32)
v = np.random.randn(1, 8, 10).astype(np.float32)
time_index = pd.date_range(dt.datetime.now(), periods=1)

ds = xr.Dataset()
coords = ('time', 'lat', 'lon')
ds['u'] = (coords, np.float32(u))
ds['v'] = (coords, np.float32(v))
ds.coords['lon'] = lons
ds.coords['lat'] = lats
ds.coords['time'] = time_index

encoding = {'lat': {'zlib': False},
            'lon': {'zlib': False},
            'u': {'_FillValue': -999.0,
                  'chunksizes': (1, 8, 10),
                  'complevel': 1,
                  'zlib': True}
            }
ds.to_netcdf('test.nc', encoding=encoding)

I also tried changing dtypes, but I'm not having any luck. I'd prefer not to reopen the files with netCDF4 just to remove the _FillValue attributes. Is there a way around this that is built into xarray?

  • Interesting question, but as always, providing a [minimal working example](https://stackoverflow.com/help/mcve) makes it a lot easier for others to look into this problem. – Bart Aug 15 '17 at 13:34
  • Apologies. I've added a test example. – naja Aug 15 '17 at 14:01

1 Answer


Update (2022): In newer versions of xarray, '_FillValue': False should be replaced with '_FillValue': None. Thanks to @Biggsy for pointing this out in the comments below.
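
As a minimal sketch of the updated form, the encoding can be built for every coordinate at once. The helper name `no_fill_encoding` is hypothetical (it is not part of xarray) and only assembles a plain dict. It assumes a recent xarray in which `None` disables the fill value:

```python
def no_fill_encoding(coord_names, extra=None):
    """Build a to_netcdf() encoding dict that disables _FillValue on coordinates.

    `no_fill_encoding` is a hypothetical helper, not an xarray function.
    """
    # One entry per coordinate; None asks newer xarray to omit _FillValue.
    encoding = {str(name): {'_FillValue': None} for name in coord_names}
    # Merge per-variable settings (e.g. compression for data variables).
    for name, settings in (extra or {}).items():
        encoding.setdefault(name, {}).update(settings)
    return encoding


encoding = no_fill_encoding(
    ['time', 'lat', 'lon'],
    extra={'u': {'_FillValue': -999.0, 'zlib': True, 'complevel': 1}},
)
print(encoding)
```

With the dataset from the question, this could be passed as `ds.to_netcdf('test.nc', encoding=no_fill_encoding(ds.coords))`, since iterating over `ds.coords` yields the coordinate names.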


Adding '_FillValue': False to the lat/lon encoding seems to work:

encoding = {'lat': {'zlib': False, '_FillValue': False},
            'lon': {'zlib': False, '_FillValue': False},
            'u': {'_FillValue': -999.0,
                  'chunksizes': (1, 8, 10),
                  'complevel': 1,
                  'zlib': True}
            }

ncdump -h of the resulting file:

netcdf test {
dimensions:
    time = 1 ;
    lat = 8 ;
    lon = 10 ;
variables:
    float u(time, lat, lon) ;
        u:_FillValue = -999.f ;
    float v(time, lat, lon) ;
        v:_FillValue = NaNf ;
    float lon(lon) ;
    float lat(lat) ;
    int64 time(time) ;
        string time:units = "days since 2017-08-15 17:41:19.460662" ;
        string time:calendar = "proleptic_gregorian" ;
}
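
To check a header like this without reading it by eye, the `ncdump -h` text can be scanned for coordinate variables that still carry a `_FillValue`. This is only a rough sketch: the function name `coords_with_fill_value` is made up for illustration, and it assumes a coordinate variable is a 1-D variable named after its own dimension, as with `lon(lon)`:

```python
import re


def coords_with_fill_value(header):
    """Return coordinate variables that carry a _FillValue attribute.

    `header` is the text printed by `ncdump -h`. A coordinate variable is
    taken to be a variable named after its own dimension, e.g. lon(lon).
    """
    # Declarations such as "float lon(lon) ;" where name == dimension.
    coords = set(re.findall(r'^\s*\w+\s+(\w+)\(\1\)\s*;', header, flags=re.M))
    # Attribute lines such as "lon:_FillValue = NaNf ;".
    filled = set(re.findall(r'^\s*(?:\w+\s+)?(\w+):_FillValue\s*=',
                            header, flags=re.M))
    return sorted(coords & filled)


before = """
float lon(lon) ;
    lon:_FillValue = NaNf ;
float lat(lat) ;
    lat:_FillValue = NaNf ;
double time(time) ;
    time:_FillValue = NaN ;
"""
after = """
float u(time, lat, lon) ;
    u:_FillValue = -999.f ;
float lon(lon) ;
float lat(lat) ;
int64 time(time) ;
"""
print(coords_with_fill_value(before))  # ['lat', 'lon', 'time']
print(coords_with_fill_value(after))   # []
```

Data variables such as `u` are ignored on purpose, since a fill value is allowed there.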
  • Doh! It's always something so simple. This worked. I previously tried _FillValue=None and for some reason, it didn't occur to me to use False! Thank you so much! – naja Aug 15 '17 at 15:54
  • I couldn't find documentation on this, so it was also just guessing for me. `False` was the second attempt, after `None`... If the solution works you can accept the answer to get it off the list of open questions. – Bart Aug 15 '17 at 16:30
  • Looks like this has changed now (2022) from False to None – Biggsy Mar 30 '22 at 13:16
  • Thanks @Biggsy, I added the update to the answer. – Bart Mar 31 '22 at 10:23