
The xarray documentation gives the following example of how to use zarr compression:


In [42]: import zarr

In [43]: compressor = zarr.Blosc(cname="zstd", clevel=3, shuffle=2)

In [44]: ds.to_zarr("foo.zarr", encoding={"foo": {"compressor": compressor}})
Out[44]: <xarray.backends.zarr.ZarrStore at 0x7f383eeba970>

The encoding mapping says to apply the given compressor to the "foo" variable. But what if I want to apply it to all my variables, no matter how they are named? Would I have to explicitly build the encoding dictionary to cover every variable in my Dataset/array, or is there some kind of wildcard pattern? I just want to compress the whole Dataset with the same compressor.

marscher

1 Answer


If you want to set the same encoding for all of your variables, you can do that with a simple dict comprehension. Iterating over a Dataset yields the names of its data variables, so each name can be used directly as a key in the encoding dictionary.

Example:

import xarray as xr
import zarr

# test dataset
ds = xr.tutorial.open_dataset("tiny")

# add second variable
ds['tiny2'] = ds.tiny*2

compressor = zarr.Blosc(cname="zstd", clevel=3, shuffle=2)

# encodings: one entry per data variable (x is the variable name)
enc = {x: {"compressor": compressor} for x in ds}

# check 
print(enc)

# {'tiny': {'compressor': Blosc(cname='zstd', clevel=3, shuffle=BITSHUFFLE, blocksize=0)}, 'tiny2': {'compressor': Blosc(cname='zstd', clevel=3, shuffle=BITSHUFFLE, blocksize=0)}}


# write the dataset; enc maps each variable name to its encoding
ds.to_zarr("foo.zarr", encoding=enc)
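If this comes up often, the comprehension above could be wrapped in a small helper. The following is only a sketch: the name uniform_encoding is hypothetical and not part of xarray or zarr, and a plain list of names and a placeholder string stand in for the Dataset and the compressor object so that the snippet runs without either library installed.

```python
from typing import Any, Dict, Hashable, Iterable


def uniform_encoding(
    names: Iterable[Hashable], **settings: Any
) -> Dict[Hashable, Dict[str, Any]]:
    """Map every variable name to its own copy of the same encoding settings.

    Hypothetical helper for illustration; not an xarray or zarr function.
    """
    # dict(settings) gives each variable a separate dict, so editing one
    # entry later does not silently change the others.
    return {name: dict(settings) for name in names}


# Stand-in for a Dataset: iterating over an xarray Dataset yields its
# data variable names, so a plain list of names behaves the same here.
variable_names = ["tiny", "tiny2"]

# "zstd-level-3" is a placeholder; a real call would pass a compressor object.
enc = uniform_encoding(variable_names, compressor="zstd-level-3")
print(enc)
```

With the dataset and compressor from the example above, the call would presumably look like ds.to_zarr("foo.zarr", encoding=uniform_encoding(ds, compressor=compressor)). Individual entries can still be adjusted afterwards if one variable needs different settings.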
Val