I ran into problems decompressing Zstd-compressed data. I have HDF5 files that were compressed in the following way:
import h5py as h5
import hdf5plugin
import numpy as np  # needed for dtype=np.int32 below
import sys
import os
filefrom = sys.argv[1]
h5path = sys.argv[2]
f = h5.File(filefrom,'r')
data = f[h5path]
shape_data = data.shape[1:]
num = data.shape[0]
initShape = (1,) + shape_data
maxShape = (num,) + shape_data
f_zstd = h5.File(filefrom.split('.')[0]+'_zstd.h5','w')
d_zstd = f_zstd.create_dataset(h5path, initShape, maxshape=maxShape, dtype=np.int32, chunks=initShape, **hdf5plugin.Zstd())
d_zstd[0,] = data[0,]
for i in range(num):
    d_zstd.resize((i+1,) + shape_data)
    d_zstd[i,] = data[i,]
f_zstd.close()
f.close()
The compression ran without any errors, but when I try to inspect the data with h5ls or h5dump, they report that the data can't be printed. Reading the compressed data in Python 3 (3.6) with h5py also fails. I also tried h5repack (h5repack -i compressed_file.h5 -o out_file.h5 --filter=var:NONE) and the following piece of code:
import zstandard
import pathlib
import os
def decompress_zstandard_to_folder(input_file):
    input_file = pathlib.Path(input_file)
    destination_dir = os.path.dirname(input_file)
    with open(input_file, 'rb') as compressed:
        decomp = zstandard.ZstdDecompressor()
        output_path = pathlib.Path(destination_dir) / input_file.stem
        with open(output_path, 'wb') as destination:
            decomp.copy_stream(compressed, destination)
Neither approach worked. With h5repack, no warnings or errors appeared. With the Python snippet, I got zstd.ZstdError: zstd decompressor error: Unknown frame descriptor, which, as I understand it, means that the compressed data doesn't have the appropriate headers.
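To make the "headers" point concrete, here is a minimal sketch that only looks at the first bytes of a file. It assumes the HDF5 file has no user block, so the signature sits at offset 0. The helper names sniff_container and sniff_file are hypothetical, introduced just for illustration. A standalone Zstandard stream starts with the frame magic number 0xFD2FB528, while an HDF5 file starts with its own 8-byte signature, so a stream decompressor pointed at the whole .h5 file would not find a Zstandard frame at the start.

```python
import struct

# First 8 bytes of an HDF5 file (assuming no user block precedes the superblock).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
# Zstandard frame magic number, stored little-endian at the start of a frame.
ZSTD_MAGIC = struct.pack("<I", 0xFD2FB528)


def sniff_container(header: bytes) -> str:
    """Classify a byte string by its leading magic bytes."""
    if header.startswith(HDF5_SIGNATURE):
        return "hdf5"
    if header.startswith(ZSTD_MAGIC):
        return "zstd"
    return "unknown"


def sniff_file(path) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return sniff_container(fh.read(8))
```

If sniff_file reports "hdf5" for the compressed file, the Zstandard-compressed chunks live inside the HDF5 container rather than forming a raw Zstandard stream.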
I use Python 3.6.7 and HDF5 1.10.5. I'm a bit confused and don't have any idea how to overcome this issue.
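For reference, here is a sketch of how the h5repack call could be scripted with a filter plugin search path set. HDF5_PLUGIN_PATH is the environment variable the HDF5 library consults for dynamically loaded filters. The function name build_h5repack_call and the plugin directory are hypothetical placeholders, and I have not confirmed that this resolves the problem.

```python
import os


def build_h5repack_call(src, dst, plugin_dir):
    """Return (command, environment) for an h5repack run that removes all filters.

    plugin_dir is a placeholder for a directory containing HDF5 filter plugins.
    """
    env = dict(os.environ)
    # HDF5 searches this directory for dynamically loaded filter plugins.
    env["HDF5_PLUGIN_PATH"] = str(plugin_dir)
    # "-f NONE" asks h5repack to write the output without any filters.
    cmd = ["h5repack", "-f", "NONE", str(src), str(dst)]
    return cmd, env
```

The returned pair could then be passed to subprocess.run(cmd, env=env, check=True) to actually run the tool.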
I'd be happy to hear any ideas or advice!