I'm planning to compress the Oracle export dump files of all Oracle database servers to reduce space usage on their disks.
My question is: Do you know of any good compression utility/algorithm for compressing this type of file?
I have done some comparisons using a 4800 MB .DMP file, with the following results:
Then I have done some tests on a smaller .DMP file, of 72 MB:
All these tests have been done with 7Zip, and the compression levels refer to the default settings as available in 7Zip.
Some conclusions and observations so far
LZMA and LZMA2 seem to be the best options for .DMP files. I have seen other people report this too in other forums. In 7zip, LZMA only supports 2 CPUs, while LZMA2 supports more (and will also use more memory). Since the compression results look identical, selecting LZMA2 should result in faster compression.
Now let's play with word and dictionary size for ultra compressed LZMA2 files...
Final conclusions
7zip, 7z, LZMA2, or however you name it, is the way to go.
If you have the patience, going for ultra compression pays off. Going for an even higher word size also pays off, but going for a bigger dictionary size might not (in this test case).
I'd be curious to get some averages of compression ratios from people trying this. I'll try to remember to post results once I'm done compressing some bigger archives.
New conclusions
I've been fine-tuning this, and had the best results time/ratio wise with these settings:
Oracle dump files use a proprietary binary format, so you want a general-purpose compressor that handles binary data well. bzip2 is good with binary data, so I would recommend that.
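A minimal example (the file name is hypothetical); -k keeps the original dump next to the compressed copy:

```shell
# sample "dump" so the command runs as-is
seq 1 200000 > export_full.dmp

# -9: largest block size, best ratio; -k: keep export_full.dmp
# and write export_full.dmp.bz2 alongside it
bzip2 -9 -k export_full.dmp
```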
HERE is a great article I used when making a backup script on some linux machines. Worked really well. I ended up using rzip for my needs. I used it to backup and compress all types of data files including SQL database dumps.
Building on @Wouter's answer I was able to achieve a good result with the following options (approximately 3 times smaller dictionary and 10 times larger word size; I can't confirm their impact on the compression time):
7z a -r -ms=e4g -mqs+ -m0=LZMA2:a=0:d=18m:mf=hc4:fb=128:mc=5:lc=4:lp=0:pb=0 -bt outfile.7z @items_to_archive.txt
It gives a compression ratio of 0.12 (compressed size / original size) and is 18-20% faster than -mx4. Higher compression can be achieved with a=1 and with higher values for d, fb and mc, at the expense of increased compression time.
For comparison, WinRar's "Best" method with default options gives a 0.11 compression ratio but is approximately 6x slower, while its "Fastest" method gives a 0.17 ratio and is 1.5x faster. Rar archives are 3x faster to extract, though.
I'd first try using 7zip with the ultra compression level and see what you get. I definitely get better results from .7z than from the .zip format...