I am trying to access gridrad data. This data is stored on an OPeNDAP server, with one URL/file for each timestamp (example). Normally xarray is great for accessing these OPeNDAP grids, because you can slice up the giant grid to get only the section you want, before calling a .compute()
to actually download it (or at least, I think that's how it works).
For example, I can download a subset of elevation data like this:
elevation_url = 'https://pae-paha.pacioos.hawaii.edu/thredds/dodsC/srtm30plus_v11_land'
elevation_data = xr.open_dataarray(elevation_url)
elevation_data = elevation_data.sel(lon = slice(-105, -100), lat = slice(35, 45)).compute()
However, the gridrad dataset is stored in a sparse format, as seen below
url = 'https://rda.ucar.edu/thredds/dodsC/files/g/ds841.1/202006/nexrad_3d_v4_2_20200601T000000Z.nc'
radar_data = xr.open_dataset(url)
print(radar_data)
<xarray.Dataset>
Dimensions: (Longitude: 2832, Latitude: 1248, Altitude: 28,
Sweep: 1902, Index: 4288002, time: 1)
Coordinates:
* Longitude (Longitude) float64 235.0 235.0 235.1 ... 293.9 294.0 294.0
* Latitude (Latitude) float64 24.01 24.03 24.05 ... 49.95 49.97 49.99
* Altitude (Altitude) float64 1.0 1.5 2.0 2.5 ... 19.0 20.0 21.0 22.0
* time (time) datetime64[ns] 2020-06-01
Dimensions without coordinates: Sweep, Index
Data variables:
sweeps_merged (Sweep) |S64 ...
index (Index) int32 ...
Reflectivity (Index) float32 ...
wReflectivity (Index) float32 ...
SpectrumWidth (Index) float32 ...
wSpectrumWidth (Index) float32 ...
datehour (time) int32 ...
Nradobs (Altitude, Latitude, Longitude) int8 ...
Nradecho (Altitude, Latitude, Longitude) int8 ...
...
Where "Index" gives the flattened index location of each variable in a Altitude x Latitude x Longitude
array.
I can extract the data like this:
refl = np.zeros(radar_data['Nradobs'].size)*np.nan
refl[radar_data['index']] = radar_data['Reflectivity']
refl = refl.reshape(radar_data['Nradobs'].shape)
But this requires downloading the entire 'Reflectivity'
grid, which is way more than I actually need.
I can come up with a general method to get just the indices I want, but I'm not sure where to go after this:
min_lon = 240
max_lon = 241
min_lat = 37
max_lat = 38
min_alt = 1
max_alt = 2
alt_inds, lat_inds, lon_inds = np.unravel_index(radar_data['index'], radar_data['Nradobs'].shape)
good_lon_inds = radar_data['Longitude'][lon_inds].to_series().between(min_lon, max_lon).to_numpy()
good_lat_inds = radar_data['Latitude'][lat_inds].to_series().between(min_lat, max_lat).to_numpy()
good_alt_inds = radar_data['Altitude'][alt_inds].to_series().between(min_alt, max_alt).to_numpy()
good_inds = good_lon_inds & good_lat_inds & good_alt_inds
Is there a way to access only a small geographic subset of the whole grid, even though it is saved in this sparse format? Bonus points if you can come up with a way to get a temporal range (by accessing multiple URLs) as well.