Questions tagged [zarr]

Zarr is a Python package providing an implementation of compressed, chunked, N-dimensional arrays (like NetCDF4, HDF5), designed for use in parallel computing and in the cloud. See http://zarr.readthedocs.io/en/stable/ for more information.

93 questions
0
votes
1 answer

zarr consolidate_metadata errors with TypeError: memoryview: a bytes-like object is required, not 'Array'

I'm trying to consolidate the metadata of an existing zarr store, though the same error occurs if I make a new zarr store and call zarr.consolidate_metadata(store). Code example: import zarr ## create test zarr store path_to_store =…
Adair
  • 1,697
  • 18
  • 22
0
votes
1 answer

xarray loading int data as float

Say I create a dataset with an integer variable. import xarray as xr import numpy as np int_var = np.random.randint(0, 10, 10) ds = xr.Dataset(data_vars={"int_var": (("x"), int_var)}, coords={"x": range(10)}) Then I save it,…
Adair
  • 1,697
  • 18
  • 22
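The usual explanation for the behaviour above is that xarray's decoding has to allow for missing values, and NumPy integer dtypes cannot represent NaN, so any variable with a fill value is promoted to float on load; disabling the fill value when writing (e.g. `encoding={"int_var": {"_FillValue": None}}`) is the commonly suggested fix. A minimal NumPy sketch of the underlying constraint (no xarray or zarr involved here):

```python
import numpy as np

int_var = np.random.randint(0, 10, 10)

try:
    int_var[0] = np.nan  # NumPy int dtypes have no NaN representation
except ValueError as e:
    print("cannot store NaN in an int array:", e)

# Any decode path that may need to mask missing values must therefore
# promote to a float dtype first.
as_float = int_var.astype("float64")
as_float[0] = np.nan  # fine after promotion
```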
0
votes
1 answer

Is it possible to store multiple groups and arrays in a single file?

I've been using HDF5 to store time series data and I want to try using Zarr due to its various features. I'm reading its tutorial and following each step, and I've realized that maybe Zarr uses directories on a file system instead of a single file…
maynull
  • 1,936
  • 4
  • 26
  • 46
0
votes
2 answers

How can I rename a Zarr array without writing new store?

I have a Zarr datastore, but I need to rename one of the dimensions. Let's say I have this (from xarray docs): data = np.random.rand(4, 3) locs = ["IA", "IL", "IN"] times = pd.date_range("2000-01-01", periods=4) da = xr.DataArray(data,…
j sad
  • 1,055
  • 9
  • 16
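For a zarr v2 DirectoryStore, each array lives under its own key prefix (a directory on disk), so a rename can in principle be done in place: rename the directory and patch any JSON metadata that mentions the old name (xarray's `_ARRAY_DIMENSIONS` in `.zattrs`, plus `.zmetadata` if the store is consolidated). A stdlib-only sketch against a hypothetical minimal store layout (the directory and attribute names here are illustrative, not a full zarr store):

```python
import json
import os
import tempfile

# Fabricate a minimal v2-style layout: one array directory with .zattrs.
store = tempfile.mkdtemp()
os.mkdir(os.path.join(store, "old_name"))
with open(os.path.join(store, "old_name", ".zattrs"), "w") as f:
    json.dump({"_ARRAY_DIMENSIONS": ["old_name"]}, f)

def rename_array(store, old, new):
    # 1) Rename the key prefix (a plain directory in a DirectoryStore).
    os.rename(os.path.join(store, old), os.path.join(store, new))
    # 2) Patch metadata that still refers to the old name.
    attrs_path = os.path.join(store, new, ".zattrs")
    with open(attrs_path) as f:
        attrs = json.load(f)
    attrs["_ARRAY_DIMENSIONS"] = [new if d == old else d
                                  for d in attrs.get("_ARRAY_DIMENSIONS", [])]
    with open(attrs_path, "w") as f:
        json.dump(attrs, f)

rename_array(store, "old_name", "new_name")
print(os.listdir(store))  # ['new_name']
```

No chunk data is copied; only the directory entry and the small JSON files change.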
0
votes
2 answers

Transform zarr directory storage to zip storage

Code: store = zarr.ZipStore("/mnt/test.zip", "r") Problem description: Hi, sorry for bothering, I found this statement in the official Zarr documentation about ZipStore: Alternatively, use a DirectoryStore when writing the data, then manually Zip…
eddie Gao
  • 3
  • 1
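The manual-zipping route the documentation suggests can be done with the standard library alone: walk the DirectoryStore and add each file under its path relative to the store root. ZIP-level compression is unnecessary because the chunks are already compressed. A sketch (function name is mine, not part of any API):

```python
import os
import zipfile

def zip_directory_store(src_dir, zip_path):
    """Zip an on-disk store so it can later be read via a ZipStore.

    Entry names must be relative to the store root; ZIP_STORED skips
    re-compressing already-compressed chunk files.
    """
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for root, _, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, src_dir).replace(os.sep, "/")
                zf.write(full, rel)
```

One likely bug in the question's snippet: in zarr v2, `ZipStore`'s second positional parameter is not the mode, so the read-only open should be spelled `zarr.ZipStore("/mnt/test.zip", mode="r")`.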
0
votes
1 answer

Does Zarr have built-in multi-threading support for fast read and write?

I am trying to speed up reading and writing Zarr files using multi-threading. For example, if I can store an array in 5 chunks, is there a way to use a thread per chunk to speed up reading and writing the array to and from disk (possibly using…
Ali Jooya
  • 75
  • 7
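Zarr does not spawn threads for plain reads and writes itself, but its compression codecs (e.g. Blosc via numcodecs) release the GIL, so a thread pool over chunk-aligned slices is a common pattern. A sketch with a stand-in reader (no zarr here; `read_chunk` would really take a chunk-aligned slice of the array):

```python
from concurrent.futures import ThreadPoolExecutor

N_CHUNKS = 5

def read_chunk(i):
    # Stand-in for reading chunk i; with a real zarr array this would be
    # a chunk-aligned slice like z[i * chunk_len:(i + 1) * chunk_len].
    return bytes([i]) * 4

# One worker per chunk, as the question proposes.
with ThreadPoolExecutor(max_workers=N_CHUNKS) as pool:
    chunks = list(pool.map(read_chunk, range(N_CHUNKS)))

print(len(chunks))  # 5
```

The same pattern works for writes, since writes to distinct chunks touch distinct store keys and need no locking.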
0
votes
1 answer

verify that Zarr has been fully installed by running the test suite

Suggested commands from the Zarr docs: $ pip install pytest $ python -m pytest -v --pyargs zarr What I tried to make it work: $ pip3 install pytest (succeeded) $ python3.7 pytest -v --pyargs zarr Error I…
0
votes
1 answer

Memory leak issue using PyTorch IterableDataset with zarr

I'm trying to build a PyTorch project on an IterableDataset with zarr as the storage backend. class Data(IterableDataset): def __init__(self, path, start=None, end=None): super(Data, self).__init__() store =…
sobek
  • 1,386
  • 10
  • 28
0
votes
1 answer

Dask looping overhead from libraries

When calling another library from dask, such as scikit-image's contrast stretch, I realise that dask creates a result for each block, storing it either in memory or spilling it to disk separately. Then it attempts to merge all the results. That's fine if…
0
votes
1 answer

Limit memory footprint when storing `dask.array.map_blocks` output

Consider a 2D array X too large to fit in memory--in my case it's stored in the Zarr format, but that doesn't matter. I would like to map a function block-wise over the array and save the result without ever loading the entire array into…
Richard Border
  • 3,209
  • 16
  • 30
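The memory pattern being asked for can be shown with NumPy alone: keep both input and output on disk and materialize one block at a time. Here `np.memmap` stands in for the on-disk Zarr arrays, and the shapes are illustrative:

```python
import os
import tempfile
import numpy as np

rows, cols, block_rows = 8, 4, 2
tmp = tempfile.mkdtemp()

# On-disk input and output; stand-ins for zarr arrays.
X = np.memmap(os.path.join(tmp, "x.dat"), dtype="float64",
              mode="w+", shape=(rows, cols))
X[:] = np.arange(rows * cols).reshape(rows, cols)

Y = np.memmap(os.path.join(tmp, "y.dat"), dtype="float64",
              mode="w+", shape=(rows, cols))

# Process one block at a time: only `block_rows` rows are ever held
# as a regular in-memory array.
for start in range(0, rows, block_rows):
    stop = min(start + block_rows, rows)
    block = np.asarray(X[start:stop])   # load just this block
    Y[start:stop] = block * 2           # write the result straight to disk
Y.flush()
```

With dask the analogous route is `dask.array.map_blocks(f, X)` followed by `.store(target)` or `.to_zarr(...)`, which writes blocks as they complete rather than gathering them; calling `.compute()` on the full result is what forces everything into memory.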
0
votes
1 answer

Efficient way of storing 1TB of random data with Zarr

I'd like to store 1TB of random data backed by a zarr on disk array. Currently, I am doing something like the following: import numpy as np import zarr from numcodecs import Blosc compressor = Blosc(cname='lz4', clevel=5,…
quasiben
  • 1,444
  • 1
  • 11
  • 19
0
votes
0 answers

How can one write lock a zarr store during append?

Is there some way to lock a zarr store when using append? I have already found out the hard way that using append with multiple processes is a bad idea (the batches to append aren't aligned with the batch size of the store). The reason I'd like to…
sobek
  • 1,386
  • 10
  • 28
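Zarr ships a `ProcessSynchronizer` for chunk-level locking, but serializing whole `append` calls across processes needs a store-level lock held around each append. A minimal stdlib sketch using atomic exclusive file creation (illustrative only; a library such as `fasteners` is more robust in practice):

```python
import os
import time

class FileLock:
    """Minimal cross-process lock via atomic O_CREAT|O_EXCL file creation."""

    def __init__(self, path):
        self.path = path

    def acquire(self, timeout=10.0, poll=0.05):
        deadline = time.monotonic() + timeout
        while True:
            try:
                # Creating the lock file succeeds in exactly one process.
                fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                os.close(fd)
                return
            except FileExistsError:
                if time.monotonic() > deadline:
                    raise TimeoutError(f"could not acquire {self.path}")
                time.sleep(poll)

    def release(self):
        os.remove(self.path)
```

Wrapping each `z.append(batch)` in `acquire()`/`release()` makes the appends run one at a time, so chunk boundaries in the store stay consistent regardless of batch size.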
0
votes
1 answer

Display all variants

I have a 2GB VCF DNA file and I am trying to use vcf_to_zarr() to print out all the variants with all fixed fields, but I am getting the error KeyError: 'variants/*' allel.vcf_to_zarr import allel import numcodecs import zarr def readVcf(): …
user11766958
  • 409
  • 3
  • 12
0
votes
1 answer

Zarr multithreaded reading of groups

Not sure if this question makes sense/is relevant wrt zarr. I'm storing zarr data on disk in groups so for example I have group = zarr.group() d1 = group.create_dataset('baz', shape=100, chunks=10) d2 = group.create_dataset('foo', shape=100,…
Michael
  • 7,087
  • 21
  • 52
  • 81
0
votes
2 answers

How to create .mdb file?

I am new to zarr, HDF5 and LMDB. I have converted data from HDF5 to Zarr but I got many files with extension .n (n from 0 to 31). I want to have just one file with a .zarr extension. I tried to use LMDB (the zarr.LMDBStore function) but I don't…