I'm under the impression that a ByteArrayOutputStream
is not memory efficient, since all of its contents are stored in memory.
Similarly, calling toByteArray
on a large stream seems like it would scale poorly, since it copies the entire internal buffer into a new byte[].
Why, then, does the example in Tom White's book Hadoop: The Definitive Guide use both of them:
ByteArrayOutputStream out = new ByteArrayOutputStream();
Decoder decoder = DecoderFactory.defaultFactory().createBinaryDecoder(out.toByteArray(), null);
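For reference, the fuller example around that snippet (paraphrased from memory - the StringPair schema and field values are my stand-ins, and I've used the newer EncoderFactory/DecoderFactory entry points rather than the deprecated defaultFactory() call) serializes a single record into the buffer and immediately decodes it back:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.*;
import org.apache.avro.io.*;

public class RoundTrip {
    public static void main(String[] args) throws IOException {
        // Stand-in schema; the book uses something similar to this
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"StringPair\",\"fields\":["
            + "{\"name\":\"left\",\"type\":\"string\"},"
            + "{\"name\":\"right\",\"type\":\"string\"}]}");

        GenericRecord datum = new GenericData.Record(schema);
        datum.put("left", "L");
        datum.put("right", "R");

        // Write the record into an in-memory buffer
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(schema);
        Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(datum, encoder);
        encoder.flush();

        // Read it straight back out of a copy of that buffer
        DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema);
        Decoder decoder = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        GenericRecord result = reader.read(null, decoder);
        System.out.println(result.get("left") + ", " + result.get("right"));
    }
}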
Isn't "Big Data" the norm for Avro? What am I missing?
Edit 1: What I'm trying to do - say I'm streaming Avro records over a websocket. What would the example look like if I wanted to deserialize multiple records, not just one that was put in its own ByteArrayOutputStream?
Is there a better way to supply BinaryDecoder
with a byte[]? Or perhaps a different type of stream? Or should I be sending one record per stream instead of packing multiple records into a single stream?
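To make the question concrete, here's a sketch of the kind of loop I have in mind (the payload byte[] standing in for a websocket message is hypothetical, and I'm assuming binaryDecoder(InputStream, ...) plus BinaryDecoder.isEnd() are the right tools for detecting the end of the data):

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DecoderFactory;

public class MultiRecordDecode {
    // Decode every record in a buffer holding several records back to back.
    static void decodeAll(byte[] payload, Schema schema) throws IOException {
        InputStream in = new ByteArrayInputStream(payload);
        // binaryDecoder(InputStream, reuse) reads from the stream incrementally
        // instead of requiring the whole byte[] up front
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(in, null);
        DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema);
        GenericRecord record = null;  // reused across iterations to cut allocation
        while (!decoder.isEnd()) {    // true once the source data is exhausted
            record = reader.read(record, decoder);
            System.out.println(record);
        }
    }
}

Even if that works, I'm not sure whether isEnd() is the sanctioned way to detect the last record, or whether each record should be length-prefixed instead.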