I have to unzip a .gz file, for which I am using the following code:

    FileInputStream fis = null;
    FileOutputStream fos = null;
    GZIPInputStream gin = null;
    try {
        File file = new File(getPath(), zipName);
        fis = new FileInputStream(file);
        gin = new GZIPInputStream(fis);
        File newFile = // some initialization related to path and name
        fos = new FileOutputStream(newFile);
        byte[] buf = new byte[1024];
        int len;
        // read() returns -1 at end of stream
        while ((len = gin.read(buf)) != -1) {
            fos.write(buf, 0, len);
        }
        gin.close();
        fos.close();
        // newFile is ready
    } catch (IOException e) {
        // exception catch
    }

However, when the client's .gz file is corrupt, I get the following error:

  Exception in thread "main" java.io.EOFException: Unexpected end of ZLIB input stream
    at java.util.zip.InflaterInputStream.fill(Unknown Source)
    at java.util.zip.InflaterInputStream.read(Unknown Source)
    at java.util.zip.GZIPInputStream.read(Unknown Source)
    at java.util.zip.InflaterInputStream.read(Unknown Source)

Surprisingly, the file is still ungzipped and stored at a local location. I do not want a corrupt file to be ungzipped or processed further at all.

One way could be to delete the file that newFile points to when the catch clause is hit with java.io.EOFException, but is that the right approach? Other exceptions could also be thrown even when the file wasn't corrupt.

coretechie
  • If you get *any* `IOException` for any reason, from corrupt input to the disk flying into orbit around Pluto, you should delete the output file and treat the operation as a failure. You don't have to think beyond that. – user207421 Sep 09 '15 at 10:18
  • Why are you limiting your buffer to a constant size? That doesn't make any sense and WILL create buffer overflows at some point. Use a `ByteArrayOutputStream` instead - the underlying buffer will grow as needed. In fact I think [PipedInputStream](https://docs.oracle.com/javase/7/docs/api/java/io/PipedInputStream.html) and [PipedOutputStream](https://docs.oracle.com/javase/7/docs/api/java/io/PipedOutputStream.html) are more suited to your needs – specializt Sep 09 '15 at 10:21
  • @specializt Why *not*? Why assume everything will fit into memory? Why add the latency? Why add the extra code? Why not just write directly to the target and deal with these exceptional cases as they arise? What value do the piped streams add? – user207421 Sep 09 '15 at 10:48
  • Well, you're obviously new to programming and there are a lot of "why"s - I shall try to explain it: limiting buffers like that will also limit performance - in your case it will be quite significant, since uncompressed data is usually quite a bit larger and will reduce the possible input segment size. There is no "latency" involved if you use ByteArrayOutputStream, in fact it will be FASTER. There is also no "extra code" since it's only a replacement of a declaration - and if you actually worry about "extra code" in Java ... well ... you're doing it wrong. – specializt Sep 09 '15 at 10:59
  • There is also no "direct" write-through in the above example, literally everything is buffered and copied back and forth - and there are no "exceptional cases" (other than IOException) checked for, it's pretty much the most primitive solution one can think of - without error handling of any kind. Piped streams are somewhat of a "direct" passing since they do not have any relevant overhead - except for what is necessary. It is ALWAYS a good idea to re-use given JRE/JDK tools and not re-invent the wheel. If it is already there ... use it and make your life MUCH easier. – specializt Sep 09 '15 at 11:03

1 Answer

To quote @EJP's comment:

If you get any IOException for any reason, from corrupt input to the disk flying into orbit around Pluto, you should delete the output file and treat the operation as a failure. You don't have to think beyond that. – EJP
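
For illustration, here is a minimal sketch of that advice. It is not code from the question: the class name `SafeGunzip` and the method `gunzip` are hypothetical, and it assumes Java 7+ (`java.nio.file`). It decompresses into a temporary file in the target's directory and only moves it into place once the whole stream has been read without error, so a corrupt or truncated .gz never leaves a partial output file behind.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, named only for this example.
public class SafeGunzip {

    /**
     * Decompresses the gzip file at source into target.
     * On any failure the exception propagates and no target file is created.
     */
    public static void gunzip(Path source, Path target) throws IOException {
        // Work on a temporary file next to the target, so the target path
        // is only ever touched after a fully successful decompression.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "gunzip-", ".tmp");
        try {
            try (InputStream raw = Files.newInputStream(source);
                 InputStream in = new GZIPInputStream(raw)) {
                // Throws EOFException/ZipException (both IOExceptions)
                // if the input is truncated or corrupt.
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; removes the partial file otherwise.
            Files.deleteIfExists(tmp);
        }
    }
}
```

The caller then only needs a single `catch (IOException e)` that treats the operation as failed; there is no need to single out `EOFException`.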

coretechie