I was wondering why an HDF5 (.h5) file takes up more space on the hard disk than a normal MAT-file even though the contents are the same. I always thought HDF5 was a compressed format. The details are below.

Using MATLAB R2014b on 64-bit Linux (Ubuntu).

Code 1:

clear,clc
h5create('myfile.h5','/DS1',[900 9000]);
mydata = rand(900,9000);
h5write('myfile.h5', '/DS1', mydata);
data = h5read('myfile.h5','/DS1');

Code 2:

clear,clc
a=rand(900,9000);
save a a;

The MAT-file is about 2 MB smaller than the HDF5 file (61 MB). Are there any flags that I am missing in the HDF5 saving process?

1 Answer

By default HDF5 is uncompressed, but it supports different compression filters. gzip (deflate) is available in MATLAB; you can enable it simply by setting the 'Deflate' level higher than 0.

h5create('myfile_gzip.h5','/DS1',[900 9000],'Deflate',9,'ChunkSize',[100,100]);
h5write('myfile_gzip.h5', '/DS1', a);

For ChunkSize I made a rough guess that happened to give good results; you may want to try other values if you see worse results.
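
If you want to experiment, a small sketch like the one below (file names and chunk sizes are only illustrative guesses, not taken from the answer) writes the same data with a few candidate chunk sizes and prints the resulting file sizes. Note that compression in HDF5 only applies to chunked datasets, which is why 'ChunkSize' has to be given together with 'Deflate'.

mydata = rand(900,9000);
chunks = {[100 100],[900 100],[90 900]};      % candidate chunk sizes (guesses)
for k = 1:numel(chunks)
    fname = sprintf('myfile_chunk%d.h5',k);   % hypothetical file name
    if exist(fname,'file'), delete(fname); end
    h5create(fname,'/DS1',[900 9000],'Deflate',9,'ChunkSize',chunks{k});
    h5write(fname,'/DS1',mydata);
    info = dir(fname);                        % size on disk in bytes
    fprintf('chunk [%d %d]: %.1f MB\n',chunks{k}(1),chunks{k}(2),info.bytes/2^20);
end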

  • Not sure if I was following the code correctly, but the exact code yields a 58 MB file, which is the same as the default MAT-file (the output of code 2 in my initial question). Playing around with the chunk size did not yield anything better. – user2375049 Dec 25 '14 at 23:03
  • @user2375049: MAT-files are compressed as well, so reaching about the same size is what I would call "good". Such random data is hard to compress; you can't expect any wonders (the sketch below illustrates this). – Daniel Dec 25 '14 at 23:28
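
As a rough illustration of the last comment (the data and file name here are made up for this sketch): writing highly redundant data with the very same 'Deflate' settings shrinks the file dramatically, while rand data stays close to its raw 900*9000*8 bytes ≈ 62 MB.

easy = repmat(rand(900,1),1,9000);            % every column identical, very compressible
if exist('easy_gzip.h5','file'), delete('easy_gzip.h5'); end
h5create('easy_gzip.h5','/DS1',[900 9000],'Deflate',9,'ChunkSize',[100 100]);
h5write('easy_gzip.h5','/DS1',easy);
info = dir('easy_gzip.h5');
fprintf('compressible data: %.2f MB\n',info.bytes/2^20);  % far below 62 MB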