
I am using SequenceInputStream to merge multiple streams into a single stream. I am on JDK8. Following is the code.

private InputStream mergeInputStreams(final Map<String, InputStream> fileAssets, final JSONObject json) throws Exception {

    final List<InputStream> listStreams = new ArrayList<InputStream>();

    listStreams.add(stringToStream(HEADER));
    addToList(json, listStreams);

    listStreams.add(stringToStream(HEADER_2));
    addToList(fileAssets.get(FILE_2), listStreams, true);

    listStreams.add(stringToStream(HEADER_3));
    addToList(fileAssets.get(FILE_3), listStreams, false);

    return new SequenceInputStream(Collections.enumeration(listStreams));
}

private void addToList(final InputStream inputStream, List<InputStream> listStreams, final boolean delimiter) throws Exception {
    final byte[] input = byteArrayFromStream(inputStream);
    listStreams.add(intToStream(input.length));
    listStreams.add(new ByteArrayInputStream(input));
    if (delimiter) {
        listStreams.add(stringToStream("\n"));
    }
}

private void addToList(final JSONObject json, final List<InputStream> listStreams) throws Exception {
    final String jsonString = json.toString();
    listStreams.add(intToStream(jsonString.length()));
    listStreams.add(stringToStream(jsonString));
}

The problem is that I only ever get the first stream's content out of the SequenceInputStream object, i.e. I just get the HEADER string. I've tried several options, including

new SequenceInputStream(listStreams.get(9), listStreams.get(9)); 

In the above example, I am trying to merge the same input stream with itself. However, I still get the content of that stream (index 9 in the list) only once.

I have verified that I do get multiple streams in the enumeration.

It would be great if someone could help me understand what's going on here.

Abhishek

2 Answers


It will read the first stream until end of stream, then the second, and so on. Possibly that isn't what you're expecting? That also means you can't supply the same stream twice, as it will already have been read completely on the first usage.
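To illustrate the point above, here is a minimal sketch using in-memory streams. The class and method names (`SequentialReadDemo`, `readAll`, `stream`) are made up for this example and are not from the question's code. It reads two distinct streams through a `SequenceInputStream`, then passes the same stream instance twice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
final class SequentialReadDemo {

    // Reads until read() reports end of stream (-1); never consults available().
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper: wraps a string in an in-memory stream.
    static InputStream stream(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));
    }

    // Two separate streams: the first is drained, then the second.
    static String distinctStreams() {
        return readAll(new SequenceInputStream(stream("abc"), stream("def")));
    }

    // The same instance twice: it is already at end of stream
    // when the sequence reaches it the second time.
    static String sameStreamTwice() {
        InputStream once = stream("abc");
        return readAll(new SequenceInputStream(once, once));
    }

    public static void main(String[] args) {
        System.out.println(distinctStreams());
        System.out.println(sameStreamTwice());
    }
}
```

The second call yields the content only once, which matches what the question reports for `new SequenceInputStream(listStreams.get(9), listStreams.get(9))`.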

I fail to see what constructors have to do with it.

user207421
  • Yes, I believe it's doing so. I was just experimenting with available(). It looks like it reports availability stream by stream, i.e. if we have 3 streams, the first call returns the available() count of only the first stream. And it looks like the downstream code that consumes the InputReader object calls available() directly to fetch that many bytes. This seems wrong to me. I will provide an update as I find more. – Abhishek Apr 28 '15 at 08:39
  • It isn't even obliged to do that. If you're expecting it to return the total length of all the streams, that's a misuse which is specifically warned against in the Javadoc. There are very few correct uses of `available()`. Don't use it. – user207421 Apr 28 '15 at 08:43
  • I agree. I happened to check the downstream library that's using this. Let me propose modifying it so that it doesn't use available(). Thanks. – Abhishek Apr 28 '15 at 09:12
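The behaviour discussed in these comments can be reproduced with a small sketch. It assumes in-memory `ByteArrayInputStream` parts, and the names `AvailableDemo`, `merged` and `brokenCopy` are invented for illustration. `brokenCopy` mimics a consumer that sizes its buffer from `available()` and reads once; it is a guess at the pattern described, not the actual library code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
final class AvailableDemo {

    // A 6-byte part followed by a longer part.
    static InputStream merged() {
        byte[] first = "HEADER".getBytes(StandardCharsets.US_ASCII);
        byte[] second = "payload-bytes".getBytes(StandardCharsets.US_ASCII);
        return new SequenceInputStream(
                new ByteArrayInputStream(first),
                new ByteArrayInputStream(second));
    }

    // available() on the sequence reflects only the part currently being read.
    static int availableAtStart() {
        try (InputStream in = merged()) {
            return in.available();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Anti-pattern: treat available() as the total length and read once.
    static String brokenCopy() {
        try (InputStream in = merged()) {
            byte[] buffer = new byte[in.available()];
            int n = in.read(buffer);
            return new String(buffer, 0, Math.max(n, 0), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(availableAtStart());
        System.out.println(brokenCopy());
    }
}
```

With these inputs the anti-pattern returns only the first part, which is the symptom described in the question.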

Here's what we have:

  1. Create a SequenceInputStream object 's' from 'n' streams.
  2. Upload 's' to S3 using an external library: lib.uploadToS3(s)

The issue: The third-party library's uploadToS3(stream) call was using stream.available() to size its buffer array, then filling that buffer from the stream and uploading it.

It looks like SequenceInputStream.available() (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/io/SequenceInputStream.java) returns available() from the current stream it's iterating over. So in the context of lib.uploadToS3(), it was using available() from the very first stream in the sequence.

What we fixed: We changed the library to use IOUtils.copy() instead of hand-written copy code that relies on available().
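For readers who cannot pull in Commons IO, the same idea can be sketched with the plain JDK: loop until read() returns -1 and never ask available() for a size. This is only an approximation of what a copy utility does, not the IOUtils source, and the names `CopyDemo`, `copy` and `mergedCopy` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the original answer.
final class CopyDemo {

    // Copies everything from in to out; stops only at end of stream (-1).
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    static InputStream stream(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));
    }

    // Merges three parts via an Enumeration, as in the question, and copies them all.
    static String mergedCopy() {
        List<InputStream> parts =
                Arrays.asList(stream("HEADER|"), stream("body|"), stream("trailer"));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream merged = new SequenceInputStream(Collections.enumeration(parts))) {
            copy(merged, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(mergedCopy());
    }
}
```

Because the loop keeps reading across part boundaries, every part of the sequence ends up in the output regardless of what available() would have reported.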

Abhishek
  • Well done. I continue to find it incredible that people will use `available()` for the very thing the Javadoc tells you not to use it for. – user207421 Apr 29 '15 at 22:08
  • I will further note that expecting `available()` to return the sum of the underlying streams' `available()` methods is irrational. If the first stream has *M* bytes 'available' without blocking and a further *N* bytes readable with blocking, it is totally irrelevant what can be read from the other streams without blocking, as the *N* bytes have to get read first. – user207421 Dec 03 '19 at 11:29