ZipArchiveInputStream vs ZipFile
It appears that ZipArchiveInputStream has some limitations, as stated by the official documentation:
ZIP archives store archive entries in sequence and contain a
registry of all entries at the very end of the archive. It is
acceptable for an archive to contain several entries of the same name
and have the registry (called the central directory) decide which
entry is actually to be used (if any).
In addition the ZIP format stores certain information only inside the
central directory but not together with the entry itself, this is:
- internal and external attributes
- different or additional extra fields
This means the ZIP format cannot really be parsed correctly while
reading a non-seekable stream, which is what ZipArchiveInputStream
is forced to do. As a result ZipArchiveInputStream
- may return entries that are not part of the central directory at all and shouldn't be considered part of the archive.
- may return several entries with the same name.
- will not return internal or external attributes.
- may return incomplete extra field data.
ZipArchiveInputStream shares these limitations with java.util.zip.ZipInputStream.
ZipFile is able to read the central directory first and provide
correct and complete information on any ZIP archive.
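To make the central-directory point concrete, here is a minimal sketch that uses only the JDK's java.util.zip classes as a stand-in, since the quoted documentation says java.util.zip.ZipInputStream shares the same limitations. It relies on the entry comment, which the ZIP format keeps in the central directory rather than next to the entry data. The class name CentralDirectoryDemo, the entry name hello.txt and the helper methods are hypothetical names chosen for illustration; this is not Commons Compress code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class CentralDirectoryDemo {

    // Writes a one-entry archive whose entry carries a comment.
    // The comment ends up in the central directory, not in the local header.
    static Path writeArchive() throws IOException {
        Path zip = Files.createTempFile("demo", ".zip");
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            ZipEntry entry = new ZipEntry("hello.txt");
            entry.setComment("stored in the central directory only");
            zos.putNextEntry(entry);
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return zip;
    }

    // Streaming read: only the local header has been seen at this point.
    static String commentViaStream(Path zip) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(Files.newInputStream(zip))) {
            ZipEntry entry = zis.getNextEntry();
            return entry.getComment();
        }
    }

    // Random-access read: the central directory is parsed first.
    static String commentViaZipFile(Path zip) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            return zf.getEntry("hello.txt").getComment();
        }
    }

    public static void main(String[] args) throws IOException {
        Path zip = writeArchive();
        try {
            System.out.println("stream:  " + commentViaStream(zip));
            System.out.println("ZipFile: " + commentViaZipFile(zip));
        } finally {
            Files.delete(zip);
        }
    }
}
```

With the JDK classes, the streaming reader should report no comment at all, while ZipFile returns the text that was written. The same kind of gap applies to the attributes and extra fields listed above.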
ZIP archives know a feature called the data descriptor which is a way
to store an entry's length after the entry's data. This can only work
reliably if the size information can be taken from the central
directory or the data itself can signal it is complete, which is true
for data that is compressed using the DEFLATED compression algorithm.
ZipFile has access to the central directory and can extract entries
using the data descriptor reliably. The same is true for
ZipArchiveInputStream as long as the entry is DEFLATED. For STORED
entries ZipArchiveInputStream can try to read ahead until it finds
the next entry, but this approach is not safe and has to be enabled by
a constructor argument explicitly.
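The effect of the data descriptor can be observed with a small experiment. The sketch below again uses the JDK's java.util.zip classes rather than Commons Compress, on the assumption that the behaviour is comparable. It relies on java.util.zip.ZipOutputStream writing a data descriptor for a DEFLATED entry whose size was not set in advance. DataDescriptorDemo, data.txt and the helper methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class DataDescriptorDemo {

    // Builds an archive in memory. The entry is DEFLATED (the default) and its
    // size is not set up front, so the sizes are written after the data.
    static byte[] writeArchive() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buf)) {
            zos.putNextEntry(new ZipEntry("data.txt"));
            zos.write("some payload".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return buf.toByteArray();
    }

    // Size as reported by the streaming reader before any entry data is read.
    static long sizeBeforeReading(byte[] archive) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            return zis.getNextEntry().getSize();
        }
    }

    // Size as reported by ZipFile, which takes it from the central directory.
    static long sizeViaZipFile(byte[] archive) throws IOException {
        Path tmp = Files.createTempFile("demo", ".zip");
        try {
            Files.write(tmp, archive);
            try (ZipFile zf = new ZipFile(tmp.toFile())) {
                return zf.getEntry("data.txt").getSize();
            }
        } finally {
            Files.delete(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = writeArchive();
        System.out.println("stream, before reading: " + sizeBeforeReading(archive));
        System.out.println("ZipFile:                " + sizeViaZipFile(archive));
    }
}
```

The streaming reader has to report the size as unknown (-1) until the compressed data has been consumed, which works only because DEFLATED data signals its own end. A STORED entry has no such marker, which is why reading it from a stream requires the unsafe read-ahead described above.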
Conclusion:
If possible, you should always prefer ZipFile over ZipArchiveInputStream.
I believe that by ZipFile the sentence above means reading each entry through an InputStream obtained from a ZipFile:
InputStream is = zipFile.getInputStream(zipArchiveEntry);