Context

In the Xarray documentation section "Appending to existing Zarr stores", the example is as follows:

import numpy as np
import xarray as xr
import dask.array

# Write zarr with empty structure (metadata only, no array data yet)
dummies = dask.array.zeros(30, chunks=10)
ds = xr.Dataset({"foo": ("x", dummies)})
path = "path/to/directory.zarr"

ds.to_zarr(path, compute=False)

# Write the first ten values into the matching region of the store
ds = xr.Dataset({"foo": ("x", np.arange(30))})
ds.isel(x=slice(0, 10)).to_zarr(path, region={"x": slice(0, 10)})

Question

The example works fine as long as I know the integer slices of the array.

How do I append if I do not know the required region? That is, if I'm given the result of ds.isel(x=slice(0, 10)) without knowing the region slice(0, 10)?

Possible solutions

In theory, I have all the information from the coordinates. For example, for float coordinates, I could do something like

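# Assumes ds['x'] is sorted in ascending order
start_index = (ds['x'] >= start).argmax().values.item()  # first index with x >= start
end_index = (ds['x'] <= end).argmin().values.item()  # first index with x > end (0 if every x <= end)
# region is slice(start_index, end_index)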

to determine the isel/zarr indices.

However, this gets fairly involved when dealing with a dataset with many dimensions and coordinates of various types (float, string, datetime). This makes me wonder if there is a more straightforward way.

Dahn

1 Answer

This is currently not implemented in Xarray, but it has been requested as a feature: https://github.com/pydata/xarray/issues/7702

That issue also explains a workaround very similar to the one you used here.
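For illustration, here is a minimal sketch of that kind of workaround. It is not an Xarray API: `find_region` is a hypothetical helper that works on plain NumPy arrays of coordinate labels, and it assumes that the labels along each dimension are unique and that the piece you were handed is a contiguous block of the full coordinate. Because it only compares labels for equality, the same code should work for float, string, and datetime coordinates.

```python
import numpy as np


def find_region(full_coords, sub_coords):
    """Return {dim: slice} locating a contiguous subset inside the full coordinates.

    Hypothetical helper: both arguments are dicts mapping a dimension name
    to a 1-D array of coordinate labels.
    """
    region = {}
    for dim, sub in sub_coords.items():
        full = np.asarray(full_coords[dim])
        sub = np.asarray(sub)
        if sub.size == 0:
            raise ValueError(f"subset is empty along {dim!r}")

        # Find where the first label of the subset sits in the full coordinate.
        matches = np.flatnonzero(full == sub[0])
        if matches.size != 1:
            raise ValueError(f"first label along {dim!r} not found exactly once")

        start = int(matches[0])
        stop = start + sub.size

        # Verify that the subset really is one contiguous, identically labelled block.
        if stop > full.size or not np.array_equal(full[start:stop], sub):
            raise ValueError(f"subset is not a contiguous block along {dim!r}")

        region[dim] = slice(start, stop)
    return region


# Example with a float coordinate and a string coordinate.
full = {"x": np.linspace(0.0, 2.9, 30), "name": np.array(["a", "b", "c", "d"])}
part = {"x": full["x"][10:20], "name": full["name"][1:3]}
region = find_region(full, part)
```

With a dataset whose dimensions all carry coordinates, the dicts could be built as `{d: part_ds[d].values for d in part_ds.dims}` for the piece and likewise from the dataset read back with `xr.open_zarr(path)`, and the result passed as `part_ds.to_zarr(path, region=region)`. Note that the example in the question has no `x` coordinate, so one would have to be written to the store first.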

It seems like prioritizing this would be a good idea for the Xarray team.

Ryan