0

I am trying to access gridrad data. This data is stored on an OPeNDAP server, with one URL/file for each timestamp (example). Normally xarray is great for accessing these OPeNDAP grids, because you can slice up the giant grid to get only the section you want, before calling a .compute() to actually download it (or at least, I think that's how it works).

For example, I can download a subset of elevation data like this:

elevation_url  = 'https://pae-paha.pacioos.hawaii.edu/thredds/dodsC/srtm30plus_v11_land'
elevation_data = xr.open_dataarray(elevation_url)
elevation_data = elevation_data.sel(lon = slice(-105, -100), lat = slice(35, 45)).compute()

However, the gridrad dataset is stored in a sparse format, as seen below

url = 'https://rda.ucar.edu/thredds/dodsC/files/g/ds841.1/202006/nexrad_3d_v4_2_20200601T000000Z.nc'
radar_data = xr.open_dataset(url)
print(radar_data)

<xarray.Dataset>
Dimensions:         (Longitude: 2832, Latitude: 1248, Altitude: 28,
                     Sweep: 1902, Index: 4288002, time: 1)
Coordinates:
  * Longitude       (Longitude) float64 235.0 235.0 235.1 ... 293.9 294.0 294.0
  * Latitude        (Latitude) float64 24.01 24.03 24.05 ... 49.95 49.97 49.99
  * Altitude        (Altitude) float64 1.0 1.5 2.0 2.5 ... 19.0 20.0 21.0 22.0
  * time            (time) datetime64[ns] 2020-06-01
Dimensions without coordinates: Sweep, Index
Data variables:
    sweeps_merged   (Sweep) |S64 ...
    index           (Index) int32 ...
    Reflectivity    (Index) float32 ...
    wReflectivity   (Index) float32 ...
    SpectrumWidth   (Index) float32 ...
    wSpectrumWidth  (Index) float32 ...
    datehour        (time) int32 ...
    Nradobs         (Altitude, Latitude, Longitude) int8 ...
    Nradecho        (Altitude, Latitude, Longitude) int8 ...
...

Where "Index" gives the flattened index location of each variable in a Altitude x Latitude x Longitude array.

I can extract the data like this:

refl   = np.zeros(radar_data['Nradobs'].size)*np.nan
refl[radar_data['index']]   = radar_data['Reflectivity']
refl  = refl.reshape(radar_data['Nradobs'].shape)

But this requires downloading the entire 'Reflectivity' grid, which is way more than I actually need.

I can come up with a general method to get just the indices I want, but I'm not sure where to go after this:

min_lon = 240
max_lon = 241
min_lat = 37
max_lat = 38
min_alt = 1
max_alt = 2

alt_inds, lat_inds, lon_inds = np.unravel_index(radar_data['index'], radar_data['Nradobs'].shape)

good_lon_inds = radar_data['Longitude'][lon_inds].to_series().between(min_lon, max_lon).to_numpy()
good_lat_inds = radar_data['Latitude'][lat_inds].to_series().between(min_lat, max_lat).to_numpy()
good_alt_inds = radar_data['Altitude'][alt_inds].to_series().between(min_alt, max_alt).to_numpy()

good_inds = good_lon_inds & good_lat_inds & good_alt_inds

Is there a way to access only a small geographic subset of the whole grid, even though it is saved in this sparse format? Bonus points if you can come up with a way to get a temporal range (by accessing multiple URLs) as well.

hm8
  • 1,381
  • 3
  • 21
  • 41
  • are you able to download arbitrary subsets along index, and you're just looking for help structuring the requests in a clean way (e.g. more API design for your use case) or is the issue that you're unsure if xarray can make requests like this via OpenDap? If the former, it would be helpful if you could demonstrate what requesting aribitrary blocks along index looks like. – Michael Delgado Sep 13 '22 at 14:05
  • I guess I want a way to download arbitrary subsets along `Index`. I added a little chunk of code to the original post. – hm8 Sep 13 '22 at 15:34

0 Answers0