2

I asked this once before and my post was deleted for not providing the code that uses the helper class. This time I have created a full test suite which shows the exact problem.

I am of the opinion that Java's ZipInputStream breaks the Liskov Substitution Principle (LSP) with regards to the InputStream abstract class. For ZipInputStream to be a subtype of InputStream, then objects of type InputStream in a program may be replaced with objects of type ZipInputStream without altering any of the desirable properties of that program (correctness, task performed, etc.).

The way in which LSP is violated here is for the read methods.

InputStream.read(byte[], int, int) states that it returns:

the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.

The problem with ZipInputStream is that it has modified the meaning of a -1 return value. It states:

the actual number of bytes read, or -1 if the end of the entry is reached

(there is actually a hint to a similar problem with the available method in the Android documentation http://developer.android.com/reference/java/util/zip/ZipInputStream.html)

Now for the code that demonstrates the problem. (This is a cut down version of what I was actually trying to do so please excuse any poor style, multithreading problems, or the fact that the stream is advanced etc.).

Class that accepts any InputStream to generate a SHA1 of the stream:

public class StreamChecker {

    private byte[] lastHash = null;

    public boolean isDifferent(final InputStream inputStream) throws IOException {
        final byte[] hash = generateHash(inputStream);
        final byte[] temp = lastHash;
        lastHash = hash;
        return !Arrays.equals(temp, hash);
    }

    private byte[] generateHash(final InputStream inputStream) throws IOException {
        return DigestUtils.sha1(inputStream);
    }
}

Unit tests:

public class StreamCheckerTest {

    @Test
    public void testByteArrayInputStreamIsSame() throws IOException {
        final StreamChecker checker = new StreamChecker();
        final byte[] bytes = "abcdef".getBytes();
        try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
            Assert.assertTrue(checker.isDifferent(stream));
        }
        try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
            Assert.assertFalse(checker.isDifferent(stream));
        }
        // Passes
    }

    @Test
    public void testByteArrayInputStreamWithDifferentDataIsDifferent() throws IOException {
        final StreamChecker checker = new StreamChecker();
        byte[] bytes = "abcdef".getBytes();
        try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
            Assert.assertTrue(checker.isDifferent(stream));
        }
        bytes = "123456".getBytes();
        try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
            Assert.assertTrue(checker.isDifferent(stream));
        }
        // Passes
    }

    @Test
    public void testZipInputStreamIsSame() throws IOException {
        final StreamChecker checker = new StreamChecker();
        final byte[] bytes = "abcdef".getBytes();
        try (final ZipInputStream stream = createZipStream("test", bytes)) {
            Assert.assertTrue(checker.isDifferent(stream));
        }
        try (final ZipInputStream stream = createZipStream("test", bytes)) {
            Assert.assertFalse(checker.isDifferent(stream));
        }
        // Passes
    }

    @Test
    public void testZipInputStreamWithDifferentEntryDataIsDifferent() throws IOException {
        final StreamChecker checker = new StreamChecker();
        byte[] bytes = "abcdef".getBytes();
        try (final ZipInputStream stream = createZipStream("test", bytes)) {
            Assert.assertTrue(checker.isDifferent(stream));
        }
        bytes = "123456".getBytes();
        try (final ZipInputStream stream = createZipStream("test", bytes)) {
            // Fails here
            Assert.assertTrue(checker.isDifferent(stream));
        }
    }

    private ZipInputStream createZipStream(final String entryName,
            final byte[] bytes) throws IOException {
        try (final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                final ZipOutputStream stream = new ZipOutputStream(outputStream)) {
            stream.putNextEntry(new ZipEntry(entryName));
            stream.write(bytes);
            return new ZipInputStream(new ByteArrayInputStream(
                    outputStream.toByteArray()));
        }
    }
}

So back to the problem... LSP is violated since you can read to the end of the stream for an InputStream but not for a ZipInputStream and of course this will break the correctness property of any method that tries to use it in such a way.

Is there any way that this can be achieved or is ZipInputStream fundamentally flawed?

steinybot
  • 5,491
  • 6
  • 37
  • 55

1 Answers1

2

I see no LSP violation. The documentation for ZipInputStream.read(byte[], int, int) says 'Reads from the current ZIP entry into an array of bytes'.

At any one time, the ZipInputStream is really the input stream of the entry, not the whole ZIP file. And it's hard to see what else ZipInputStream.read() could possibly do at end of entry other than return -1.

this will break the correctness property of any method that tries to use it in such a way

Hard to see how the method would ever know.

user207421
  • 305,947
  • 44
  • 307
  • 483
  • This doesn't really answer the question. I was asking whether it is possible to write a method like this in which it can take an InputStream and generate a SHA1 from the entire stream regardless of what the actual implementation of the InputStream is. – steinybot Apr 07 '15 at 20:53
  • @Steiny And I've answered it. If you don't want the ZIPInputStream to unzip entries, don't use it. But that's what it does if you do use it. You seem to be expecting it to deliver the entire stream. It doesn't. – user207421 Aug 25 '15 at 10:18