I have a valid zip file on my classpath (Java 8). It is 302617 bytes long. I would like to copy it to a temp folder for expansion and further processing in my application, using the standard Apache Commons IO `IOUtils`. If I read it as a file, e.g.:

    File out = new File("out.zip");
    File in = new File("src/main/resources/StartUpData/c4.zip");
    try (InputStream is = new FileInputStream(in);
         FileOutputStream fos = new FileOutputStream(out)) {
        IOUtils.copy(is, fos);
        System.out.println(out.length());
    }

this works exactly as expected - printing out 302617.

If, however, I read from a classpath input stream:

    try (InputStream is2 = this.getClass().getResourceAsStream("/StartUpData/c4.zip");
         FileOutputStream fos = new FileOutputStream(out)) {
        IOUtils.copy(is2, fos);
        System.out.println(out.length());
    }

it generates a file of 544115 bytes. It is not valid zip format, it cannot be unzipped or read as a zip file by any command line zip utils or Java. I only observe this behaviour with zip files; for other binary files or images both approaches work fine.

I investigated the bytes being read in both cases. Here are the first 12 bytes of the file, from `xxd -b c4.zip`:

    00000000: 01010000 01001011 00000011 00000100 00010100 00000000  PK....
    00000006: 00001000 00001000 00001000 00000000 10111010 10011110  ......

The 11th and 12th bytes in the file (10111010 10011110, hex ba 9e) are read from the classpath input stream as hex ef bf.

In fact, any byte with its most significant bit set to 1 is misread by the input stream created by

    this.getClass().getResourceAsStream("/StartUpData/c4.zip")

Does anyone know why this happens only for zip files being read from the classpath? How can 10111010 10011110 be interpreted as ef bf? Many thanks for any suggestions. I am using macOS High Sierra; my colleague also observes this behaviour on Windows 10.

otter606
  • `src` is not on the CLASSPATH. It isn't there at runtime at all. You should be accessing the ZIP as a resource, not as a file. – user207421 Feb 13 '19 at 11:55
  • it is on the classpath, and is there at runtime, and is being read. That's exactly what I'm trying to do, accessing as a resource. – otter606 Feb 13 '19 at 11:56
  • On the face of it, this behavior seems impossible. Can you please provide us with an MCVE we can use to reproduce it for ourselves. – Stephen C Feb 13 '19 at 12:01
  • Is the classpath resource coming directly from the file system, or did you package your classes and resources into a jar file? In the latter case, the problem may have occurred while you were creating the jar file - that process may have corrupted the zip file. Try unpacking the jar file and see if the zip file is corrupted then. – Erwin Bolwidt Feb 13 '19 at 12:04
  • I had troubles in the past with try-with-resources block. Have you tried to not use it? Get the resource inside the try and close the stream at the end. – Victor Calatramas Feb 13 '19 at 12:34
  • Thanks for the suggestions. I copied the test class and file to a new project to post on GitHub, using the same project structure etc., and it works as expected. So it must be something about the project setup, no idea what though, as both are standard Maven projects with the same JRE. – otter606 Feb 13 '19 at 12:41
  • removing try-with-resources made no difference – otter606 Feb 13 '19 at 12:43
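
The check suggested in the comments (is the copy that ends up on the classpath already corrupted?) can be sketched roughly as follows. This is only a diagnostic sketch: the class name `ResourceDiff` and its helper methods are hypothetical, and the two paths are the ones used in the question. It prints where the resource is actually loaded from and the offset of the first byte that differs from the source file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original project.
class ResourceDiff {

    // Returns the index of the first differing byte, or -1 if both arrays are identical.
    static int firstMismatch(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : common;
    }

    // Reads a stream fully into memory (Java 8 has no InputStream.readAllBytes).
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Paths taken from the question; adjust for your own project.
        String resourceName = "/StartUpData/c4.zip";
        Path sourceFile = Paths.get("src/main/resources/StartUpData/c4.zip");

        URL url = ResourceDiff.class.getResource(resourceName);
        if (url == null || !Files.exists(sourceFile)) {
            System.out.println("Resource or source file not found; nothing to compare.");
            return;
        }
        // Shows which copy the class loader really serves (e.g. a build output folder).
        System.out.println("Resource loaded from: " + url);

        byte[] source = Files.readAllBytes(sourceFile);
        byte[] onClasspath;
        try (InputStream in = url.openStream()) {
            onClasspath = readAll(in);
        }
        System.out.println("Source size:    " + source.length);
        System.out.println("Classpath size: " + onClasspath.length);
        System.out.println("First mismatch at offset: " + firstMismatch(source, onClasspath));
    }
}
```

If the printed URL points at a build output directory and the sizes already differ there, the file was altered during the build rather than while it was being read.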

1 Answer

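This was a Maven resource filtering issue; see https://maven.apache.org/plugins/maven-resources-plugin/examples/binaries-filtering.html for the solution. With filtering enabled, the zip file was processed as text while being copied to the build output, which corrupted it. Adding zip as an exclusion (a non-filtered file extension) fixed this, and the zip file can be anywhere on the classpath.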
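
As for how `ba 9e` can come out as `ef bf`: here is a minimal sketch, assuming the filtering step decoded the file as UTF-8 text and wrote it back as UTF-8 (the encoding actually used by the build is not stated above, so this is an assumption). Bytes such as `ba` and `9e` are not valid UTF-8 on their own, so Java's decoder replaces each with U+FFFD, which is encoded as `ef bf bd`. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only simulates a text round trip, it does not call Maven.
class FilteringCorruptionDemo {

    // ASSUMPTION: the filter reads the bytes as UTF-8 text and writes UTF-8 back out.
    static byte[] roundTripAsUtf8(byte[] original) {
        String decoded = new String(original, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Formats bytes as space-separated lowercase hex, e.g. "50 4b 03".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // First 12 bytes of c4.zip, taken from the xxd output in the question.
        byte[] header = {
            0x50, 0x4b, 0x03, 0x04, 0x14, 0x00,
            0x08, 0x08, 0x08, 0x00, (byte) 0xba, (byte) 0x9e
        };
        byte[] filtered = roundTripAsUtf8(header);

        // prints: original: 50 4b 03 04 14 00 08 08 08 00 ba 9e
        System.out.println("original: " + toHex(header));
        // prints: filtered: 50 4b 03 04 14 00 08 08 08 00 ef bf bd ef bf bd
        System.out.println("filtered: " + toHex(filtered));
    }
}
```

Under that assumption the first ten bytes survive because they are all below 0x80, while each invalid byte grows from one byte to three, which is consistent with the copy being larger than the 302617-byte original.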

otter606