I have an `xarray.Dataset` with two 1D variables, `sun_azimuth` and `sun_elevation`, with multiple timesteps along the `time` dimension:
```python
import xarray as xr
import numpy as np

ds = xr.Dataset(
    data_vars={
        "sun_azimuth": ("time", [10, 20, 30, 40, 50]),
        "sun_elevation": ("time", [5, 10, 15, 20, 25]),
    },
    coords={"time": [1, 2, 3, 4, 5]},
)
ds
```
I also have a function similar to the one below, which I want to apply to each timestep. It uses the `sun_azimuth` and `sun_elevation` values for that timestep to produce a 2D array:
```python
def hillshade(x):
    """
    Takes a single timestep of data, and uses its sun_azimuth
    and sun_elevation values to output a 2D array.
    """
    dem = np.ones((5, 5))
    hillshade_array = dem * x.sun_azimuth.item() + x.sun_elevation.item()
    return xr.DataArray(hillshade_array, dims=["y", "x"])
```
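For a single timestep the toy function just produces a constant 5×5 array (setup repeated here so the snippet runs on its own):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    data_vars={
        "sun_azimuth": ("time", [10, 20, 30, 40, 50]),
        "sun_elevation": ("time", [5, 10, 15, 20, 25]),
    },
    coords={"time": [1, 2, 3, 4, 5]},
)

def hillshade(x):
    dem = np.ones((5, 5))
    return xr.DataArray(
        dem * x.sun_azimuth.item() + x.sun_elevation.item(), dims=["y", "x"]
    )

# For the first timestep: 1 * 10 + 5 == 15 everywhere
one = hillshade(ds.isel(time=0))
print(one.shape)         # (5, 5)
print(float(one[0, 0]))  # 15.0
```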
I know I can apply this function to each timestep like this:

```python
ds.groupby("time").apply(hillshade)
```
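To make the memory problem concrete: the eager version stacks one 2D array per timestep into a single in-memory `(time, y, x)` cube (setup repeated so this runs standalone):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    data_vars={
        "sun_azimuth": ("time", [10, 20, 30, 40, 50]),
        "sun_elevation": ("time", [5, 10, 15, 20, 25]),
    },
    coords={"time": [1, 2, 3, 4, 5]},
)

def hillshade(x):
    dem = np.ones((5, 5))
    return xr.DataArray(
        dem * x.sun_azimuth.item() + x.sun_elevation.item(), dims=["y", "x"]
    )

# Every timestep's output is computed immediately and held in memory at once
result = ds.groupby("time").apply(hillshade)
print(result.dims)   # ('time', 'y', 'x')
print(result.shape)  # (5, 5, 5)
```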
However, the outputs of this function are large and take up a lot of memory. I really want to be able to do this lazily using Dask, so that I can delay the computation of the function until a later stage to reduce peak memory use.
How can I make my `hillshade` function return lazy, Dask-aware xarray arrays instead?
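For reference, the kind of lazy pattern I imagine might work is `xr.apply_ufunc` with `dask="parallelized"`, but I'm not sure it's the right approach. The chunking, `output_sizes`, and the `hillshade_np` helper below are my own guesses, not something I've confirmed:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    data_vars={
        "sun_azimuth": ("time", [10, 20, 30, 40, 50]),
        "sun_elevation": ("time", [5, 10, 15, 20, 25]),
    },
    coords={"time": [1, 2, 3, 4, 5]},
)

def hillshade_np(az, el):
    # Plain-NumPy core: one timestep's scalars in, one 2D array out
    dem = np.ones((5, 5))
    return dem * az + el

lazy = xr.apply_ufunc(
    hillshade_np,
    ds.sun_azimuth.chunk({"time": 1}),   # chunk so each timestep is its own task
    ds.sun_elevation.chunk({"time": 1}),
    vectorize=True,                      # loop hillshade_np over the time dim
    dask="parallelized",                 # build a lazy dask graph, don't compute
    output_core_dims=[["y", "x"]],
    dask_gufunc_kwargs={"output_sizes": {"y": 5, "x": 5}},
    output_dtypes=[float],
)
print(type(lazy.data))   # a dask array; nothing has been computed yet
result = lazy.compute()  # computation is deferred until here
```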