0

I'm trying to extract some climate data from a netcdf file, but I'm getting an error saying the dimensions do not exist. I've sliced netcdf files before from different model and never run into this error.

The output for the dataset looks like this:

[<xarray.Dataset>
 Dimensions:    (bnds: 2, time: 60, x: 253, y: 167)
 Coordinates:
   * time       (time) object 2021-01-16 12:00:00 ... 2025-12-16 12:00:00
     lon        (y, x) float64 ...
     lat        (y, x) float64 ...
   * x          (x) float64 -6.3e+06 -6.25e+06 -6.2e+06 ... 6.25e+06 6.3e+06
   * y          (y) float64 -4.15e+06 -4.1e+06 -4.05e+06 ... 4.1e+06 4.15e+06
     height     float64 ...
 Dimensions without coordinates: bnds
 Data variables:
     time_bnds  (time, bnds) object ...

I'm trying to slice the data with this code snippet, but it's giving me a dimension error.

In[4]:  r[0]['tasmax'].sel(lon='15.74', lat='24.36', time='2021-01-16', method='nearest').data[0]

Out[4]:

ValueError                                Traceback (most recent call last)
<ipython-input-11-16ed606949f0> in <module>
----> 1 climateModels['CSIRO-QCCCE-CSIRO-Mk3-6-0']['RegCM4-4']['RCP 45'][0]['tasmax'].sel(
      2                                 lon='15.74', lat='24.36', time='2021-01-16', method='nearest').data[0]

C:\ProgramData\Anaconda3\lib\site-packages\xarray\core\dataarray.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   1141 
   1142         """
-> 1143         ds = self._to_temp_dataset().sel(
   1144             indexers=indexers,
   1145             drop=drop,

C:\ProgramData\Anaconda3\lib\site-packages\xarray\core\dataset.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   2103         """
   2104         indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
-> 2105         pos_indexers, new_indexes = remap_label_indexers(
   2106             self, indexers=indexers, method=method, tolerance=tolerance
   2107         )

C:\ProgramData\Anaconda3\lib\site-packages\xarray\core\coordinates.py in remap_label_indexers(obj, indexers, method, tolerance, **indexers_kwargs)
    395     }
    396 
--> 397     pos_indexers, new_indexes = indexing.remap_label_indexers(
    398         obj, v_indexers, method=method, tolerance=tolerance
    399     )

C:\ProgramData\Anaconda3\lib\site-packages\xarray\core\indexing.py in remap_label_indexers(data_obj, indexers, method, tolerance)
    257     new_indexes = {}
    258 
--> 259     dim_indexers = get_dim_indexers(data_obj, indexers)
    260     for dim, label in dim_indexers.items():
    261         try:

C:\ProgramData\Anaconda3\lib\site-packages\xarray\core\indexing.py in get_dim_indexers(data_obj, indexers)
    223     ]
    224     if invalid:
--> 225         raise ValueError(f"dimensions or multi-index levels {invalid!r} do not exist")
    226 
    227     level_indexers = defaultdict(dict)

ValueError: dimensions or multi-index levels ['lon', 'lat'] do not exist

I know .where can be used here but I'm trying to find the nearest coordinates to the ones I have so can't use .where. Does anyone know what I can do to resolve this issue?

Hamza Waheed
  • 139
  • 2
  • 10

1 Answers1

0

You are performing the sel operation on lon and lat, which are not dimensions, only 2D coordinates, thus the sel method cannot work. However, we can see that x and y are 1D coordinates with names equals to their dimensions : why don't you extract the desired data with a x0 and y0 corresponding to lon=15.74 and lat=24.36 projected in the right CRS ?

Edit :

I really believe you should look into these projection / regular grids stuff. Basically, everything is easy when you deal with regular grids (i.e. with 1D coordinates). Anyway, if you want to extract your data using lon and lat, you could use :

import numpy as np
from sklearn.metrics import pairwise_distances_argmin


def extract(ds, lon, lat):
    index = pairwise_distances_argmin(X=[[lon, lat]], Y=np.c_[ds['lon'].values.ravel(), ds['lat'].values.ravel()])
    i0, j0 = np.unravel_index(index, (ds['lon'].shape))
    return ds.isel(x=j0, y=i0).squeeze()

extract(ds, 15.74, 24.36)
cyril
  • 524
  • 2
  • 8
  • How do I go on about converting my coordinates to the x and y values? I'm only using these sets for an analysis, not much clue about GIS. – Hamza Waheed Mar 02 '22 at 13:04
  • Is there anything in `ds.attrs` or in the source website that indicates the CRS or EPSG in which `x` and `y` are ? – cyril Mar 02 '22 at 13:39
  • Brother, I've no idea what any of that means. Can't I use a similar method to what you employed to convert those coordinates? I can write a small script to do it. – Hamza Waheed Mar 02 '22 at 13:53
  • See my edit, i've added a quick way to deal with your problem. Still, I think it would have been better to take advantage of the `x` and `y` dimensions/coordinates. – cyril Mar 02 '22 at 14:12
  • And sorry for taking your time again, but is there a way I could just get the conversion to regular coordinates. I can then use them in the crude code I already have. – Hamza Waheed Mar 02 '22 at 14:24
  • 1
    You are interpolating a non-rectangular grid (curvilinear etc.) to a single location. You probably need to use something like xesm to do this. Look into it: https://xesmf.readthedocs.io/en/latest/ – Robert Wilson Mar 02 '22 at 14:43