I am downloading data from a data portal, split by variable, month, and year (because it is faster to get the data this way). The files are on my drive, and I now want to stitch them together. The stitching works, and I now want to save the entire dataset ds with all variables.
What I describe below happens to ALL variables!
In the preamble, I load
import xarray as xr
import os
import numpy as np
After stitching, I look at the data and it looks reasonable.
So I save the dataset with ds.to_netcdf('Data.nc')
If I then reopen the file with xr.open_dataset('Data.nc')
the data is altered: the values no longer go beyond certain limits. I attached an image of this below.
Does anyone know what is happening here and how to solve it?
P.S.: I am using Jupyter Notebook on macOS, in case that matters.
EDIT:
Output of ncdump -hs Data.nc
is:
netcdf Data {
dimensions:
time = 561024 ;
longitude = 3 ;
latitude = 3 ;
variables:
int64 time(time) ;
time:long_name = "time" ;
time:units = "hours since 1959-01-01 00:00:00" ;
time:calendar = "proleptic_gregorian" ;
time:_Storage = "contiguous" ;
time:_Endianness = "little" ;
float longitude(longitude) ;
longitude:_FillValue = NaNf ;
longitude:units = "degrees_east" ;
longitude:long_name = "longitude" ;
longitude:_Storage = "contiguous" ;
longitude:_Endianness = "little" ;
float latitude(latitude) ;
latitude:_FillValue = NaNf ;
latitude:units = "degrees_north" ;
latitude:long_name = "latitude" ;
latitude:_Storage = "contiguous" ;
latitude:_Endianness = "little" ;
short mpww(time, latitude, longitude) ;
mpww:_FillValue = -32767s ;
mpww:units = "s" ;
mpww:long_name = "Mean period of wind waves" ;
mpww:add_offset = 2.79603913709583 ;
mpww:scale_factor = 3.90353786451101e-05 ;
mpww:missing_value = -32767s ;
mpww:_Storage = "contiguous" ;
mpww:_Endianness = "little" ;
short shts(time, latitude, longitude) ;
shts:_FillValue = -32767s ;
shts:units = "m" ;
shts:long_name = "Significant height of total swell" ;
shts:add_offset = 1.18743369983622 ;
shts:scale_factor = 1.05300150544382e-05 ;
shts:missing_value = -32767s ;
shts:_Storage = "contiguous" ;
shts:_Endianness = "little" ;
short pp1d(time, latitude, longitude) ;
pp1d:_FillValue = -32767s ;
pp1d:units = "s" ;
pp1d:long_name = "Peak wave period" ;
pp1d:add_offset = 12.2260785916261 ;
pp1d:scale_factor = 0.000189618505657455 ;
pp1d:missing_value = -32767s ;
pp1d:_Storage = "contiguous" ;
pp1d:_Endianness = "little" ;
short hmax(time, latitude, longitude) ;
hmax:_FillValue = -32767s ;
hmax:units = "m" ;
hmax:long_name = "Maximum individual wave height" ;
hmax:add_offset = 2.23715532703722 ;
hmax:scale_factor = 1.92943508559216e-05 ;
hmax:missing_value = -32767s ;
hmax:_Storage = "contiguous" ;
hmax:_Endianness = "little" ;
short mpts(time, latitude, longitude) ;
mpts:_FillValue = -32767s ;
mpts:units = "s" ;
mpts:long_name = "Mean period of total swell" ;
mpts:add_offset = 8.83459024768542 ;
mpts:scale_factor = 7.28539333599922e-05 ;
mpts:missing_value = -32767s ;
mpts:_Storage = "contiguous" ;
mpts:_Endianness = "little" ;
short swh(time, latitude, longitude) ;
swh:_FillValue = -32767s ;
swh:units = "m" ;
swh:long_name = "Significant height of combined wind waves and swell" ;
swh:add_offset = 1.19637698437532 ;
swh:scale_factor = 1.04207642782417e-05 ;
swh:missing_value = -32767s ;
swh:_Storage = "contiguous" ;
swh:_Endianness = "little" ;
short shww(time, latitude, longitude) ;
shww:_FillValue = -32767s ;
shww:units = "m" ;
shww:long_name = "Significant height of wind waves" ;
shww:add_offset = 0.457024275937314 ;
shww:scale_factor = 1.394812537195e-05 ;
shww:missing_value = -32767s ;
shww:_Storage = "contiguous" ;
shww:_Endianness = "little" ;
// global attributes:
:Conventions = "CF-1.6" ;
:history = "2023-01-05 17:41:27 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data3/adaptor.mars.internal-1672940486.4022062-20190-2-1c3b422b-fb80-4c15-b960-bcf6e7f0c58a.nc /cache/tmp/1c3b422b-fb80-4c15-b960-bcf6e7f0c58a-adaptor.mars.internal-1672940458.202113-20190-2-tmp.grib" ;
:_NCProperties = "version=1|netcdflibversion=4.6.1|hdf5libversion=1.10.6" ;
:_SuperblockVersion = 0 ;
:_IsNetcdf4 = 1 ;
:_Format = "netCDF-4" ;
}
The output of ds.swh.encoding differs between the original and the saved-and-reloaded version:
Original
{'source': '/1959_1.nc',
'original_shape': (744, 3, 3),
'dtype': dtype('int16'),
'missing_value': -32767,
'_FillValue': -32767,
'scale_factor': 1.0420764278241716e-05,
'add_offset': 1.1963769843753225}
New Version
{'zlib': False,
'shuffle': False,
'complevel': 0,
'fletcher32': False,
'contiguous': True,
'chunksizes': None,
'source': '/Users/cgdavid/Documents/01-Forschung/01-Paper/Plate_Breakwater/New_Copernicus/Data.nc',
'original_shape': (561024, 3, 3),
'dtype': dtype('int16'),
'missing_value': -32767,
'_FillValue': -32767,
'scale_factor': 1.0420764278241716e-05,
'add_offset': 1.1963769843753225}
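For reference, a minimal sketch of what the encoding shown above implies. With dtype int16, a stored integer p decodes to p * scale_factor + add_offset (the CF packing convention), so only a limited band of values can be represented. The helper name `packed_range` is hypothetical, and the numbers are simply the swh attributes from the encoding above; whether this is what actually limits the reloaded data is an assumption, not something confirmed here.

```python
# Sketch: which physical values can an int16-packed variable hold?
# Decoding rule (CF packing): value = packed * scale_factor + add_offset

INT16_MIN, INT16_MAX = -32768, 32767


def packed_range(scale_factor, add_offset, fill_value=-32767):
    """Return (lowest, highest) decodable value, skipping the fill value.

    Hypothetical helper for illustration; not part of xarray.
    """
    lo = INT16_MIN if fill_value != INT16_MIN else INT16_MIN + 1
    hi = INT16_MAX if fill_value != INT16_MAX else INT16_MAX - 1
    return lo * scale_factor + add_offset, hi * scale_factor + add_offset


# swh attributes copied from ds.swh.encoding
swh_lo, swh_hi = packed_range(1.0420764278241716e-05, 1.1963769843753225)
print(f"swh can only hold values between {swh_lo:.4f} m and {swh_hi:.4f} m")
```

If the full time series contains values outside this band, they cannot be stored with these attributes, which would be consistent with data that "does not go beyond certain values" after reloading.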
I should add that the original version shows only a small piece of the data. I downloaded the climate data from the Copernicus Climate Data Store in monthly pieces for each variable. I then combine the months into one long time series via xr.concat([ds1,ds2],'time')
and then merge the variables via xr.merge([DS1,DS2])
...
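A minimal sketch of the concat-then-merge step described above, using small synthetic datasets in place of the downloaded files. The helper `fake_month`, the random values, and the latitude/longitude coordinates are purely illustrative assumptions. The synthetic float data carries no packing encoding, so this only shows the stitching, not the save/reload behaviour.

```python
import numpy as np
import xarray as xr


def fake_month(name, start, hours):
    """Build a tiny hourly dataset with one variable on a 3x3 grid (synthetic data)."""
    time = (
        np.datetime64(start, "h") + np.arange(hours).astype("timedelta64[h]")
    ).astype("datetime64[ns]")
    data = np.random.default_rng(0).random((hours, 3, 3))
    return xr.Dataset(
        {name: (("time", "latitude", "longitude"), data)},
        coords={
            "time": time,
            "latitude": [54.5, 54.0, 53.5],  # made-up grid
            "longitude": [7.0, 7.5, 8.0],    # made-up grid
        },
    )


# Step 1: concatenate the monthly pieces of each variable along time
swh = xr.concat(
    [fake_month("swh", "1959-01-01", 744), fake_month("swh", "1959-02-01", 672)],
    dim="time",
)
hmax = xr.concat(
    [fake_month("hmax", "1959-01-01", 744), fake_month("hmax", "1959-02-01", 672)],
    dim="time",
)

# Step 2: merge the per-variable time series into one dataset
ds = xr.merge([swh, hmax])
print(ds)
```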