
I have an existing zarr archive (~6 GB) in an LMDB store saved at path. Now I want to consolidate its metadata to improve read performance.

Here is my script:

import zarr

# open the existing LMDB-backed archive and consolidate its metadata
store = zarr.LMDBStore(path)
root = zarr.open(store)
zarr.consolidate_metadata(store)
store.close()

I get the following error:

Traceback (most recent call last):
  File "zarr_consolidate.py", line 12, in <module>
    zarr.consolidate_metadata(store)
  File "/local/home/marcel/.virtualenvs/noisegan/local/lib/python3.5/site-packages/zarr/convenience.py", line 1128, in consolidate_metadata
    return open_consolidated(store, metadata_key=metadata_key)
  File "/local/home/marcel/.virtualenvs/noisegan/local/lib/python3.5/site-packages/zarr/convenience.py", line 1182, in open_consolidated
    meta_store = ConsolidatedMetadataStore(store, metadata_key=metadata_key)
  File "/local/home/marcel/.virtualenvs/noisegan/local/lib/python3.5/site-packages/zarr/storage.py", line 2455, in __init__
    d = store[metadata_key].decode()  # pragma: no cover
AttributeError: 'memoryview' object has no attribute 'decode'

I am using zarr 2.3.2 and Python 3.5.2. I have another machine running Python 3.6.2 where this works. Could it be related to the Python version?

mcb

1 Answer


Thanks for the report. Should be fixed with gh-452. Please test it out (if you are able).

If you are able to share a bit more information on why read performance suffers in your case, that would be interesting to learn about. :)
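If it helps with testing, this is roughly how I would expect the consolidate/read cycle to look once the fix is in (a minimal sketch, assuming path points at your existing LMDB archive):

import zarr

# consolidate all group/array metadata into a single key (".zmetadata" by default)
store = zarr.LMDBStore(path)
zarr.consolidate_metadata(store)

# subsequent reads go through the consolidated metadata instead of many small keys
root = zarr.open_consolidated(store)
print(root.tree())
store.close()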

jakirkham
  • Thanks a lot, it works now. I use zarr to load images and feed them to a neural network, so I need high read throughput. Performance is quite good already, but without consolidated metadata it is not fast enough to keep the GPUs busy. – mcb Jul 10 '19 at 09:24
  • Thanks for the info. If you are able to share any benchmark data about what is slow in a GitHub issue, that would be interesting to investigate further ;) – jakirkham Jul 11 '19 at 05:27