3

I tried using the Java version of LZ4 in a search-engine-like program that searches data in large text files. I simply compressed the output stream and stored the result in .txt files or in files without an extension. However, the supposedly compressed files were not smaller; they were actually larger than the original files.

In the end I had to fall back on zip4j, since it works for me.

How should I use the LZ4 or Snappy jars to compress and decompress correctly?

In addition, how can I use such algorithms to compress a single folder with many files inside?

Thanks!

kdenz
  • What did you try that did not work? Conceptually you simply wrap the FileOutputStream with an OutputStream that provides the compression then write to that stream. If you are writing text, you likely would want to wrap that with an OutputStreamWriter or PrintWriter. https://oss.sonatype.org/service/local/repositories/releases/archive/org/xerial/snappy/snappy-java/1.1.0/snappy-java-1.1.0-javadoc.jar/!/org/xerial/snappy/SnappyFramedOutputStream.html – Brett Okken Jun 16 '14 at 19:44
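
The wrapping described in the comment above can be sketched as follows. This is only a minimal illustration, not the asker's code: it uses the JDK's GZIPOutputStream / GZIPInputStream as a stand-in so that it runs without extra jars, and the class name CompressedTextFile and its methods are hypothetical. With snappy-java, the SnappyFramedOutputStream linked in the comment would take the place of GZIPOutputStream; lz4-java ships LZ4BlockOutputStream for the same role.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: text -> Writer -> compressing stream -> file stream.
public class CompressedTextFile {

    public static void write(Path file, String text) {
        // GZIPOutputStream is a stand-in; swap in another compressing OutputStream here.
        try (Writer out = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(file)), StandardCharsets.UTF_8))) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static String read(Path file) {
        // Mirror image of write(): file stream -> decompressing stream -> Reader.
        try (Reader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(file)), StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("the quick brown fox jumps over the lazy dog\n");
        }
        String text = sb.toString();

        Path file = Files.createTempFile("sample", ".gz");
        write(file, text);
        System.out.println("original bytes:   " + text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("compressed bytes: " + Files.size(file));
        System.out.println("round trip ok:    " + text.equals(read(file)));
        Files.delete(file);
    }
}
```

Because the stream decides how many bytes reach the file, there is no oversized buffer to trim by hand; closing the outermost Writer flushes and finishes the compressed stream.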

2 Answers

5

I faced a similar problem. I was trying to send a large file (~709 MB) over a local network in chunks of 8192 bytes, and I used LZ4 compression/decompression to reduce the network bandwidth.

So, assuming you are trying to do something similar, here's my suggestion:

Here's a snippet similar to the standard example you'll find at https://github.com/jpountz/lz4-java:

private static int decompressedLength;
private static LZ4Factory factory = LZ4Factory.fastestInstance();
private static LZ4Compressor compressor = factory.fastCompressor();

public static byte[] compress(byte[] src, int srcLen) {
    decompressedLength = srcLen;
    int maxCompressedLength = compressor.maxCompressedLength(decompressedLength);
    byte[] compressed = new byte[maxCompressedLength];
    compressor.compress(src, 0, decompressedLength, compressed, 0, maxCompressedLength);
    return compressed;
}

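If you return the compressed byte array as it is, it will be longer than the original uncompressed data: the array is allocated with the worst-case size maxCompressedLength, which exceeds the input length, and only its first part holds compressed output.

So you can modify it as follows: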

import java.util.Arrays;
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

private static int decompressedLength;
private static LZ4Factory factory = LZ4Factory.fastestInstance();
private static LZ4Compressor compressor = factory.fastCompressor();

public static byte[] compress(byte[] src, int srcLen) {
    decompressedLength = srcLen;
    // Worst-case output size; the buffer is usually only partly filled.
    int maxCompressedLength = compressor.maxCompressedLength(decompressedLength);
    byte[] compressed = new byte[maxCompressedLength];
    // compress() returns the number of bytes actually written to the buffer.
    int compressLen = compressor.compress(src, 0, decompressedLength, compressed, 0, maxCompressedLength);
    // Keep only the bytes that hold compressed data.
    byte[] finalCompressedArray = Arrays.copyOf(compressed, compressLen);
    return finalCompressedArray;
}

compressLen stores the actual compressed length, and the finalCompressedArray byte array (of length compressLen) stores just the compressed data. Its length is, in general, less than both the length of the compressed byte array and the length of the original uncompressed byte array.

Now you can decompress the finalCompressedArray byte array in the usual way:

private static LZ4FastDecompressor decompressor = factory.fastDecompressor();

// decompressedLength must be the exact length of the original, uncompressed data.
public static byte[] decompress(byte[] finalCompressedArray, int decompressedLength) {
    // decompress(src, destLen) allocates and returns the restored array itself.
    return decompressor.decompress(finalCompressedArray, decompressedLength);
}
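
Since decompress needs the original length, the receiver has to get it from somewhere. One possible approach, sketched here under assumptions of my own, is to put the length in a 4-byte header in front of the compressed bytes. The frame layout and the names LengthPrefixedFrame, pack and unpack are hypothetical. To keep the sketch runnable without extra jars it uses the JDK's Deflater/Inflater as a stand-in codec; with lz4-java, the compress and decompress methods above would be called at the marked places instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical frame layout: [4-byte original length][compressed bytes].
public class LengthPrefixedFrame {

    public static byte[] pack(byte[] src) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Header: the uncompressed length, big-endian.
        out.write(ByteBuffer.allocate(4).putInt(src.length).array(), 0, 4);

        // Stand-in codec; an LZ4 compressor call would go here instead.
        Deflater deflater = new Deflater();
        deflater.setInput(src);
        deflater.finish();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n); // write only the bytes actually produced
        }
        deflater.end();
        return out.toByteArray();
    }

    public static byte[] unpack(byte[] framed) {
        // Read the header first so the output array can be sized exactly.
        int originalLen = ByteBuffer.wrap(framed).getInt();
        byte[] restored = new byte[originalLen];

        // Stand-in codec; an LZ4 decompressor call would go here instead.
        Inflater inflater = new Inflater();
        inflater.setInput(framed, 4, framed.length - 4);
        try {
            int off = 0;
            while (off < originalLen && !inflater.finished()) {
                int n = inflater.inflate(restored, off, originalLen - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // no more progress possible
                }
                off += n;
            }
            if (off != originalLen) {
                throw new IllegalArgumentException("truncated frame");
            }
            return restored;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt frame", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            sb.append("chunk of a large text file\n");
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);

        byte[] framed = pack(data);
        byte[] back = unpack(framed);
        System.out.println("original: " + data.length + " bytes, framed: " + framed.length + " bytes");
        System.out.println("round trip ok: " + Arrays.equals(data, back));
    }
}
```

When sending fixed-size chunks over a network, each chunk can be framed this way, so the receiver never needs to know the uncompressed size in advance.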
Robbie Aldrich
Ankit
  • Sorry for acknowledging sooo late! – kdenz Sep 23 '15 at 17:12
  • @Ankit - If I don't know the size of the byte array after decompressing, what will decompressedLength be? – ketan Jul 29 '17 at 12:03
  • You have to store it somewhere or transfer as well. Lz4 has example code, it's called something like OutputStreamWithLength, where it stores 4 bytes with uncompressed size first. – razor Dec 13 '22 at 12:50
1

A .jar file is a .zip file. The zip file format does not support LZ4 or Snappy.

Mark Adler
  • Oops, wrong question! :P I simply meant: how can I use LZ4 or Snappy to compress large text files? – kdenz May 20 '14 at 03:45