I have a NetCDF file and am trying to use xarray to read a variable (with no attributes or coordinates) and then assign new attributes and coordinates to it. Here is my Python code:

#--- Import packages:
import numpy as np
import xarray as xr

#--- Read in data:
fin  = xr.open_dataset("sample.nc")
tsk  = fin.wrfout_tsk_tavg
xlat = fin.latitude
xlon = fin.longitude

#--- Setting new attributes and coordinates:
tsk.attrs['units'] = 'K'
tsk.assign_coords(lat=xlat, lon=xlon)

In the NetCDF file, the tsk, xlat, and xlon variables all have the same shape (2D arrays). After running the code, tsk has the new attribute but still has no coordinates. Am I doing something wrong? No error is raised.

Soroush
  • please always print your dataset when asking a question - if possible provide a full [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) so we know what's going on from the outset! – Michael Delgado Mar 18 '22 at 16:47

3 Answers


assign_coords is not an in-place operation

From the xarray.DataArray.assign_coords docs:

Returns a new object with all the original data in addition to the new coordinates.

So you need:

tsk = tsk.assign_coords(lat=xlat, lon=xlon)
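
To see the difference without the original file, here is a minimal sketch using made-up arrays. The dimension names "y" and "x" and all values are assumptions for illustration, not taken from sample.nc:

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins for the variables in the question (values are made up).
tsk = xr.DataArray(np.zeros((2, 3)), dims=("y", "x"))
xlat = xr.DataArray(np.arange(6.0).reshape(2, 3), dims=("y", "x"))
xlon = xr.DataArray(np.arange(6.0).reshape(2, 3) - 100.0, dims=("y", "x"))

# Calling assign_coords without keeping the result leaves tsk unchanged.
tsk.assign_coords(lat=xlat, lon=xlon)
unchanged_coords = list(tsk.coords)

# Rebinding the name to the returned object keeps the new coordinates.
tsk = tsk.assign_coords(lat=xlat, lon=xlon)
new_coords = sorted(tsk.coords)
```

Here unchanged_coords is an empty list, while new_coords contains "lat" and "lon". Setting tsk.attrs['units'] behaves differently because it mutates the attrs dictionary of the existing object, which is why the attribute in the question did stick.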
Michael Delgado

My other answer (keeping the result of assign_coords) is a direct solution to your stated problem.

As an alternative to assign_coords: it appears that what you have is a dataset whose dimensions (lat, lon) have no coordinates, plus non-indexing coordinates xlat (along lat) and xlon (along lon) that hold the actual lat/lon values. This is a really common situation when reading netCDF data.

I imagine your data looks something like this:

In [3]: ds = xr.Dataset(
   ...:     {'wrfout_tsk_tavg': (('lat', 'lon'), np.random.random((4, 8)))},
   ...:     coords={
   ...:         'xlat': (('lat', ), np.arange(23, 27)),
   ...:         'xlon': (('lon', ), np.arange(-110, -102)),
   ...:     },
   ...: )

In [4]: ds
Out[4]:
<xarray.Dataset>
Dimensions:          (lat: 4, lon: 8)
Coordinates:
    xlat             (lat) int64 23 24 25 26
    xlon             (lon) int64 -110 -109 -108 -107 -106 -105 -104 -103
Dimensions without coordinates: lat, lon
Data variables:
    wrfout_tsk_tavg  (lat, lon) float64 0.4214 0.5839 0.6675 ... 0.4333 0.4409

You can use xr.DataArray.swap_dims or xr.Dataset.swap_dims to switch between indexing and non-indexing coordinates with the same dimensionality. In your case:


In [5]: ds = ds.swap_dims({"lat": "xlat", "lon": "xlon"})
   ...: ds
Out[5]:
<xarray.Dataset>
Dimensions:          (xlat: 4, xlon: 8)
Coordinates:
  * xlat             (xlat) int64 23 24 25 26
  * xlon             (xlon) int64 -110 -109 -108 -107 -106 -105 -104 -103
Data variables:
    wrfout_tsk_tavg  (xlat, xlon) float64 0.4214 0.5839 0.6675 ... 0.4333 0.4409
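
Once the coordinates are indexed, label-based selection works. The following sketch rebuilds the example dataset with deterministic values (np.arange instead of random numbers, which is my substitution) so the selected value is predictable. Note that swap_dims needs 1D coordinates, as assumed here; if xlat and xlon are 2D, as the question suggests, this approach would not apply.

```python
import numpy as np
import xarray as xr

# Same layout as the example above, with deterministic data for illustration.
ds = xr.Dataset(
    {"wrfout_tsk_tavg": (("lat", "lon"), np.arange(32.0).reshape(4, 8))},
    coords={
        "xlat": (("lat",), np.arange(23, 27)),
        "xlon": (("lon",), np.arange(-110, -102)),
    },
)

# Promote the 1D non-indexing coordinates to indexed dimension coordinates.
swapped = ds.swap_dims({"lat": "xlat", "lon": "xlon"})

# Select by coordinate label rather than by integer position.
value = swapped.wrfout_tsk_tavg.sel(xlat=24, xlon=-105).item()
```

With this data, xlat=24 is row 1 and xlon=-105 is column 5, so value is 13.0.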
Michael Delgado

I think you should declare explicitly that fin.wrfout_tsk_tavg is a variable with coords lat and lon. Creating the dataset from scratch should do the trick:

ds = xr.Dataset(
    data_vars=dict(
        # "y" and "x" are placeholder dimension names; a 2D coordinate
        # should not share its name with one of its own dimensions.
        wrfout_tsk_tavg=(["y", "x"], fin.wrfout_tsk_tavg.values)),
    coords=dict(
        lon=(["y", "x"], fin.longitude.values),
        lat=(["y", "x"], fin.latitude.values)))
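
To try this approach without the file, here is a self-contained sketch with made-up 2D grids. The dimension names "y" and "x", the grid values, and the random data are all assumptions for illustration; they stand in for whatever sample.nc actually contains.

```python
import numpy as np
import xarray as xr

# Made-up 2D latitude/longitude grids and data standing in for the file contents.
lat2d, lon2d = np.meshgrid(
    np.arange(23.0, 27.0), np.arange(-110.0, -102.0), indexing="ij"
)
data = np.random.default_rng(0).random(lat2d.shape)

# Each entry is (dims, values[, attrs]); lat and lon become 2D
# non-index coordinates that share the dimensions of the data variable.
ds = xr.Dataset(
    data_vars={"wrfout_tsk_tavg": (("y", "x"), data, {"units": "K"})},
    coords={"lat": (("y", "x"), lat2d), "lon": (("y", "x"), lon2d)},
)

# The extracted DataArray carries both coordinates and the attribute.
tsk = ds["wrfout_tsk_tavg"]
```

The dimensions are deliberately not called "lat" and "lon" here, since a 2D coordinate that shares its name with one of its dimensions is rejected by at least some xarray versions.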