I asked this once before and my post was deleted for not providing the code that uses the helper class. This time I have created a full test suite which shows the exact problem.
I am of the opinion that Java's ZipInputStream breaks the Liskov Substitution Principle (LSP) with regards to the InputStream abstract class. For ZipInputStream to be a subtype of InputStream, then objects of type InputStream in a program may be replaced with objects of type ZipInputStream without altering any of the desirable properties of that program (correctness, task performed, etc.).
The way in which LSP is violated here is for the read methods.
InputStream.read(byte[], int, int) states that it returns:
the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.
The problem with ZipInputStream is that it has modified the meaning of a -1 return value. It states:
the actual number of bytes read, or -1 if the end of the entry is reached
(there is actually a hint to a similar problem with the available method in the Android documentation http://developer.android.com/reference/java/util/zip/ZipInputStream.html)
Now for the code that demonstrates the problem. (This is a cut down version of what I was actually trying to do so please excuse any poor style, multithreading problems, or the fact that the stream is advanced etc.).
Class that accepts any InputStream to generate a SHA1 of the stream:
public class StreamChecker {
private byte[] lastHash = null;
public boolean isDifferent(final InputStream inputStream) throws IOException {
final byte[] hash = generateHash(inputStream);
final byte[] temp = lastHash;
lastHash = hash;
return !Arrays.equals(temp, hash);
}
private byte[] generateHash(final InputStream inputStream) throws IOException {
return DigestUtils.sha1(inputStream);
}
}
Unit tests:
public class StreamCheckerTest {
@Test
public void testByteArrayInputStreamIsSame() throws IOException {
final StreamChecker checker = new StreamChecker();
final byte[] bytes = "abcdef".getBytes();
try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
Assert.assertFalse(checker.isDifferent(stream));
}
// Passes
}
@Test
public void testByteArrayInputStreamWithDifferentDataIsDifferent() throws IOException {
final StreamChecker checker = new StreamChecker();
byte[] bytes = "abcdef".getBytes();
try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
bytes = "123456".getBytes();
try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
// Passes
}
@Test
public void testZipInputStreamIsSame() throws IOException {
final StreamChecker checker = new StreamChecker();
final byte[] bytes = "abcdef".getBytes();
try (final ZipInputStream stream = createZipStream("test", bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
try (final ZipInputStream stream = createZipStream("test", bytes)) {
Assert.assertFalse(checker.isDifferent(stream));
}
// Passes
}
@Test
public void testZipInputStreamWithDifferentEntryDataIsDifferent() throws IOException {
final StreamChecker checker = new StreamChecker();
byte[] bytes = "abcdef".getBytes();
try (final ZipInputStream stream = createZipStream("test", bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
bytes = "123456".getBytes();
try (final ZipInputStream stream = createZipStream("test", bytes)) {
// Fails here
Assert.assertTrue(checker.isDifferent(stream));
}
}
private ZipInputStream createZipStream(final String entryName,
final byte[] bytes) throws IOException {
try (final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
final ZipOutputStream stream = new ZipOutputStream(outputStream)) {
stream.putNextEntry(new ZipEntry(entryName));
stream.write(bytes);
return new ZipInputStream(new ByteArrayInputStream(
outputStream.toByteArray()));
}
}
}
So back to the problem... LSP is violated since you can read to the end of the stream for an InputStream but not for a ZipInputStream and of course this will break the correctness property of any method that tries to use it in such a way.
Is there any way that this can be achieved or is ZipInputStream fundamentally flawed?