gzip is documented to support concatenation of compressed files:

$ echo hello >hhh
$ echo world >www
$ cat hhh www
hello
world
$ echo hello | gzip >hhhh
$ echo world | gzip >wwww
$ cat hhhh wwww | gunzip
hello
world

I can create a concatenated file with GZIPOutputStream, but GZIPInputStream reads only the first portion of the data, whereas gunzip run from the command line reads all of it.

I'm seeing this on both Android 4.1.2 and 4.4.2.

How do I read the whole file from Java?

UPDATE:
An example demonstrating the bug (the host version):

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;


class GZTest {
    static void append(File f, String s) {
        try {
            FileOutputStream fos = new FileOutputStream(f, true);
            //FileOutputStream gzos = fos;
            GZIPOutputStream gzos = new GZIPOutputStream(fos);
            gzos.write(s.getBytes("UTF-8"));
            gzos.close(); // TODO: do it finally{}
            fos.close(); // TODO: do it finally{}
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    static String readAll(File f) {
        try {
            FileInputStream fis = new FileInputStream(f);
            //FileInputStream gzis = fis;
            GZIPInputStream gzis = new GZIPInputStream(fis);
            byte[] buf = new byte[4096];
            int len = gzis.read(buf);
            gzis.close(); // TODO: do it finally{}
            fis.close(); // TODO: do it finally{}
            return new String(Arrays.copyOf(buf, len), "UTF-8");
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

}

public class A {
    public static void main(String[] args) {
        System.out.println("~~~");
        File f = new File("x.y");
        f.delete();
        GZTest.append(f, "Hello, ");
        GZTest.append(f, "world!\n");
        System.out.println(GZTest.readAll(f));
    }
}

Running it:

$ javac A.java
$ java A
~~~
Hello, 
$ gunzip <x.y
Hello, world!

UPDATE 2:
It looks like this is bug JDK-2192186, which was reported as fixed on 2010-08-03. Nevertheless, I am still seeing it.

18446744073709551615
  • Have you tried creating a second 'GZIPInputStream' after the first one finished? – eduyayo Feb 25 '15 at 09:44
  • By the way, look at this: http://stackoverflow.com/questions/13749891/gzipinputstream-fails-to-read-concatenated-gz-files-bug-resolved – eduyayo Feb 25 '15 at 09:50
  • Yeah... BTW, I'm seeing the same bug on the host with `java version "1.8.0_31"`. The bug report offers a workaround, but the code is terrible. Did Oracle publish the code with the fix applied? – 18446744073709551615 Feb 25 '15 at 10:08

2 Answers

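On the host Java, the right way to read is to call read() in a loop until it reports end of stream, because a single read() can return just the first member's data: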

static String readAll(File f) {
    try {
        FileInputStream fis = new FileInputStream(f);
        //FileInputStream gzis = fis;
        GZIPInputStream gzis = new GZIPInputStream(fis);
        final int SIZE=4096;
        byte[] buf = new byte[SIZE];
        int len=0, read=0;
        // Keep reading until the buffer is full or the stream ends.
        do {
            read = gzis.read(buf, len, SIZE-len);
            if (read < 0) {
                break; // end of stream
            }
            len += read;
        } while (len<SIZE);
        gzis.close(); // TODO: do it finally{}
        return new String(Arrays.copyOf(buf, len), "UTF-8");
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}

(This readAll() is meant to replace readAll() in the question.)
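
The fixed 4096-byte buffer above caps how much data can be returned. Below is a minimal sketch of the same read loop without that cap, assuming a host JDK whose GZIPInputStream handles concatenated members. The class name ReadAllGzip and the gzip() helper are illustrative only and are not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ReadAllGzip {
    // Illustrative helper: compresses s as one complete gzip member.
    static byte[] gzip(String s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        GZIPOutputStream gzos = new GZIPOutputStream(bos);
        try {
            gzos.write(s.getBytes("UTF-8"));
        } finally {
            gzos.close(); // writes the gzip trailer
        }
        return bos.toByteArray();
    }

    // Drains the stream: keeps calling read() until it returns -1.
    static String readAll(InputStream in) throws IOException {
        GZIPInputStream gzis = new GZIPInputStream(in);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gzis.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toString("UTF-8");
        } finally {
            gzis.close(); // also closes the wrapped stream
        }
    }

    public static void main(String[] args) throws IOException {
        // Two members written back to back, like the appended file x.y.
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        file.write(gzip("Hello, "));
        file.write(gzip("world!\n"));
        System.out.print(readAll(new ByteArrayInputStream(file.toByteArray())));
    }
}
```

To read the file from the question, pass new FileInputStream(f) to readAll().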

However, this code does not work on Android!

On Android, it still prints only the first part:

D/~~~     (30098): Hello, 

while the command line gunzip says

# cat /data/data/com.example.gzctest2/files/x.y | gunzip
Hello, world!
18446744073709551615

I have just tried the solution from the question "truncated output from GZIPInputStream on Android", and it works. The fix lives in the branch https://github.com/ymnk/jzlib/tree/concatenated_gzip_streams ; to get it, use:

git clone https://github.com/ymnk/jzlib.git
cd jzlib
git checkout concatenated_gzip_streams

and then copy the directory src/main/java (you cannot copy just one file) to your project and replace the import:

//import java.util.zip.GZIPInputStream;
import com.jcraft.jzlib.GZIPInputStream;

It works on Android!

If you want to remove unneeded files, you will need: Adler32.java, Deflate.java, GZIPInputStream.java, Inflate.java, InfTree.java, Tree.java, Checksum.java, GZIPException.java, InfBlocks.java, InflaterInputStream.java, JZlib.java, ZStream.java, CRC32.java, GZIPHeader.java, InfCodes.java, Inflater.java, StaticTree.java
and will not need: Deflater.java, DeflaterOutputStream.java, GZIPOutputStream.java, ZInputStream.java, ZOutputStream.java, ZStreamException.java

It looks like removing 6 unused files (24K) while leaving 17 files (198K) is not worth the effort.
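
For anyone who would rather not copy jzlib into the project, here is a rough dependency-free sketch built on java.util.zip.Inflater. It walks the gzip members by hand: skip the header, inflate the raw deflate data, skip the 8-byte trailer, repeat. The names MultiMemberGunzip, gunzipAll, skipHeader and gzip are hypothetical. The sketch does not verify the CRC32/ISIZE trailer, does little bounds checking on corrupt input, needs the whole file in memory, and is untested on Android (it assumes Inflater behaves there as it does on the host).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DataFormatException;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

public class MultiMemberGunzip {
    // Returns the offset of the deflate data of the member starting at p.
    static int skipHeader(byte[] d, int p) throws IOException {
        if (p + 10 > d.length || (d[p] & 0xff) != 0x1f
                || (d[p + 1] & 0xff) != 0x8b || d[p + 2] != 8) {
            throw new IOException("not a gzip member at offset " + p);
        }
        int flags = d[p + 3] & 0xff;
        p += 10; // fixed part: magic, method, flags, mtime, xfl, os
        if ((flags & 4) != 0) { // FEXTRA: 2-byte little-endian length + payload
            p += 2 + ((d[p] & 0xff) | ((d[p + 1] & 0xff) << 8));
        }
        if ((flags & 8) != 0) { while (d[p++] != 0) { } }  // FNAME, zero-terminated
        if ((flags & 16) != 0) { while (d[p++] != 0) { } } // FCOMMENT, zero-terminated
        if ((flags & 2) != 0) { p += 2; }                  // FHCRC
        return p;
    }

    // Inflates every gzip member in data and concatenates the results.
    static byte[] gunzipAll(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int pos = 0;
        while (pos < data.length) {
            pos = skipHeader(data, pos);
            Inflater inf = new Inflater(true); // true = raw deflate, no zlib wrapper
            try {
                inf.setInput(data, pos, data.length - pos);
                while (!inf.finished()) {
                    int n = inf.inflate(buf);
                    if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                        throw new IOException("truncated or corrupt gzip member");
                    }
                    out.write(buf, 0, n);
                }
                // Unconsumed input starts at the trailer; skip CRC32 + ISIZE.
                pos = data.length - inf.getRemaining() + 8;
            } catch (DataFormatException e) {
                throw new IOException(e);
            } finally {
                inf.end();
            }
        }
        return out.toByteArray();
    }

    // Test helper: compresses s as one complete gzip member.
    static byte[] gzip(String s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        GZIPOutputStream gzos = new GZIPOutputStream(bos);
        gzos.write(s.getBytes("UTF-8"));
        gzos.close();
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        file.write(gzip("Hello, "));
        file.write(gzip("world!\n"));
        System.out.print(new String(gunzipAll(file.toByteArray()), "UTF-8"));
    }
}
```

The jzlib branch remains the option that was actually confirmed to work on Android; this sketch only shows what handling the members manually involves.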

Community
18446744073709551615