3

I have a ZipInputStream that contains a number of XML files that I want to apply a transform to. The following piece of code loads the XSLT and ZIP files and loops through the ZIP entries, attempting to apply the transform to each one. However, it appears that transform() closes the input stream after performing the transform, which causes the next call to getNextEntry() to fail because the stream is closed.

Is there a simple way around this problem (keeping the input stream open so that the ZipInputStream can move to the next entry), or am I missing something more fundamental here?

TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer(new StreamSource(xsltFileName));

FileInputStream fis = new FileInputStream(fileName);
ZipInputStream zis = new ZipInputStream(fis);
ZipEntry ze = null;

while ((ze = zis.getNextEntry()) != null)
{
    String newFileName = ze.getName();
    transformer.transform(new StreamSource(zis), new StreamResult(new FileOutputStream(newFileName)));
}

I have attempted to search for a solution but don't seem to be coming up with anything that makes sense. I'd appreciate any ideas or feedback.

clyde

5 Answers

5

One possible solution is to extend ZipInputStream (it's not final) and override the close method to do nothing. Of course, you then need to make sure to close it yourself. You can do that with a second custom close method that simply calls super.close().

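import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipInputStream;

class MyZIS extends ZipInputStream {

    public MyZIS(InputStream in) {
        super(in);
    }

    // Deliberately a no-op so that callers such as the transformer cannot close the stream.
    @Override
    public void close() throws IOException {
    }

    // Call this yourself once all entries have been processed.
    public void myClose() throws IOException {
        super.close();
    }
}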
Jilles van Gurp
  • This code has the desired effect and prevents the stream from being closed. I will select this as my chosen answer as it makes the solution to the problem very clear. Michael had mentioned this as a possible workaround in his answer. – clyde Oct 20 '12 at 06:36
  • This solution is incomplete. ZipInputStream wraps an Inflater, and close() normally calls Inflater.end(), which releases native resources. If you override close(), then you need `if (!closed) { inf.end(); closed = true; }` with a `private boolean closed` member variable. Otherwise, the native resources will not be released until the Inflater is garbage collected and finalized, which might not happen before a native OutOfMemoryError occurs. – Brett Kail Jul 09 '15 at 22:27
  • On further thought, the ZipInputStream javadoc does not prohibit a future implementation from pooling the Inflater instances, so it's probably safest to use CloseShieldInputStream or `new ZipInputStream(new FilterInputStream(in) { public void close() {} })` or similar. – Brett Kail Jul 09 '15 at 22:41
1

Generally the accepted protocol is that "he who creates an input stream should close it after use" and it appears your XSLT processor (Xalan?) isn't following this convention. If that's the case, then a workaround (apart from moving to a different XSLT processor!) is to write a filter stream that wraps the ZipInputStream and passes on all calls to the underlying ZipInputStream, except for the close() call which it intercepts.
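
A minimal sketch of the wrapper described above, using only JDK classes. The class name `NonClosingZipDemo`, the `shield` helper, the in-memory sample archive, and the identity transformer (standing in for the real stylesheet) are assumptions made for illustration and are not part of the original question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NonClosingZipDemo {

    /** Passes every call through to the wrapped stream but ignores close(). */
    public static InputStream shield(InputStream in) {
        return new FilterInputStream(in) {
            @Override
            public void close() {
                // Intentionally empty: whoever opened the real stream closes it.
            }
        };
    }

    /** Builds a small in-memory ZIP with two XML entries (sample data only). */
    public static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("a.xml"));
            zos.write("<a>1</a>".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("b.xml"));
            zos.write("<b>2</b>".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Transforms every entry and returns a map of entry name to output text. */
    public static Map<String, String> transformAll(byte[] zip)
            throws IOException, TransformerException {
        // Assumption: an identity transformer stands in for the real XSLT.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        Map<String, String> results = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            ZipEntry ze;
            while ((ze = zis.getNextEntry()) != null) {
                StringWriter out = new StringWriter();
                // The parser may close the shield; the ZipInputStream stays open.
                transformer.transform(new StreamSource(shield(zis)), new StreamResult(out));
                results.put(ze.getName(), out.toString());
            }
        } // try-with-resources closes the real ZipInputStream exactly once
        return results;
    }

    public static void main(String[] args) throws Exception {
        transformAll(sampleZip())
                .forEach((name, xml) -> System.out.println(name + " -> " + xml));
    }
}
```

Because the wrapper sits around the ZipInputStream rather than replacing its close(), the real stream is still closed normally by the try-with-resources block. The CloseShieldInputStream from commons-io mentioned in the comments plays the same role as the hypothetical `shield` helper here.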

Michael Kay
  • Yes, it is Xalan; I should have mentioned that for completeness. That protocol is what I would have expected, but it doesn't appear to be followed here. I will quickly try implementing the workaround and also another processor and provide feedback. – clyde Oct 19 '12 at 14:25
  • 1
    @clyde Apache commons-io provides a [CloseShieldInputStream](http://commons.apache.org/io/api-release/org/apache/commons/io/input/CloseShieldInputStream.html) for precisely this situation. – Ian Roberts Oct 19 '12 at 14:27
  • It appears the Saxon processor has the same behaviour as Xalan. I have implemented a class that extends ZipInputStream, overriding the close method. This works as desired. – clyde Oct 20 '12 at 06:32
  • This is how the SAX and JAXP libraries have been designed, as [InputSource](https://docs.oracle.com/javase/8/docs/api/?org/xml/sax/InputSource.html) creation and consumption are unrelated (especially for [EntityResolvers](https://docs.oracle.com/javase/8/docs/api/?org/xml/sax/EntityResolver.html)). The behaviour is explained in the [InputSource](https://docs.oracle.com/javase/8/docs/api/?org/xml/sax/InputSource.html) Javadoc: ``However, standard processing of both byte and character streams is to close them on as part of end-of-parse cleanup`` – LoganMzz Sep 23 '15 at 08:01
0

You should actually be using the class ZipFile for reading the zip archive. Then you get the inputstream for the zip entry like this:

zipfile.getInputStream(zipEntry);
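
As a rough sketch of how that might look when the archive really is a file on disk: each call to getInputStream() returns a separate stream, so it does not matter if the transformer closes it. The class name `ZipFileDemo`, the temporary sample archive, and the identity transformer (in place of the real stylesheet) are assumptions for illustration only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ZipFileDemo {

    /** Writes a two-entry sample archive to a temporary file (sample data only). */
    public static Path sampleZipFile() throws IOException {
        Path path = Files.createTempFile("sample", ".zip");
        try (OutputStream os = Files.newOutputStream(path);
             ZipOutputStream zos = new ZipOutputStream(os)) {
            zos.putNextEntry(new ZipEntry("a.xml"));
            zos.write("<a>1</a>".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("b.xml"));
            zos.write("<b>2</b>".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return path;
    }

    /** Transforms every entry and returns a map of entry name to output text. */
    public static Map<String, String> transformAll(Path zipPath)
            throws IOException, TransformerException {
        // Assumption: an identity transformer stands in for the real XSLT.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        Map<String, String> results = new LinkedHashMap<>();
        try (ZipFile zipFile = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // A fresh stream per entry: closing it leaves the ZipFile usable.
                try (InputStream in = zipFile.getInputStream(entry)) {
                    StringWriter out = new StringWriter();
                    transformer.transform(new StreamSource(in), new StreamResult(out));
                    results.put(entry.getName(), out.toString());
                }
            }
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        Path zip = sampleZipFile();
        try {
            transformAll(zip)
                    .forEach((name, xml) -> System.out.println(name + " -> " + xml));
        } finally {
            Files.deleteIfExists(zip);
        }
    }
}
```

This only applies when the source is a file; ZipFile cannot read from an arbitrary InputStream.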
Zagrev
  • 1
    `ZipFile` is great when you need random access by entry name, but when you just want to read all the entries in sequence without knowing in advance what each one is called then it has no advantage over `ZipInputStream`. The stream may be better in fact as it doesn't have to parse the whole central directory up front. – Ian Roberts Oct 19 '12 at 14:56
  • There is entries() that returns the enumeration of all the entries in a file. And, the structure of a zip file is that the directory is the last thing in the file, so if you have read any entry, you have read the whole file... – Zagrev Oct 19 '12 at 15:06
  • The central directory is at the end of the file, yes, but each entry also has a local header. `ZipInputStream` does a single pass, its `ZipEntry` objects are constructed from the local headers rather than the central directory. – Ian Roberts Oct 19 '12 at 15:20
  • 1
    Unfortunately my source is not actually a file, it is another stream. I just provided the code as a sample to get an idea of how to solve the problem I am seeing. I am going to try some of the other suggestions and provide feedback. – clyde Oct 19 '12 at 15:56
-1

Perhaps what you need to do is read each zip entry into a temporary buffer, then use that buffer as the source for the transformer. My understanding is that the transformer needs to read the entire stream to determine what the transform should be; therefore, even if it didn't close the input stream, the next read would hit EOF.

Perhaps something like this? (no optimization has been done)

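    // entry.getSize() may be -1 and read() may return early, so copy until EOF.
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    byte[] chunk = new byte[8192];
    for (int n; (n = zis.read(chunk)) != -1; ) buf.write(chunk, 0, n);
    ByteArrayInputStream in = new ByteArrayInputStream(buf.toByteArray());
    transformer.transform(new StreamSource(in), new StreamResult(new FileOutputStream(newFileName)));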
Zagrev
-1

There is a property you can set: IsStreamOwner. When this is false, the underlying stream will not be closed.

Charlie