I'm using Java's BufferedInputStream class to read bytes sent to a socket. The data sent to the socket is an HTTP form submission, so it is generally a header with a defined Content-Length, followed by some content.

The problem I'm having is that sometimes BufferedInputStream.read() does not read the full amount of data sent to it. It returns the number of bytes read, but this is much less than what was sent. I have verified the bytes sent with Wireshark and can confirm the full message is being transmitted.

Sample code below:

BufferedInputStream inFromClient = new BufferedInputStream(socket.getInputStream());
int contentLength = getContentLengthFromHeader();    
byte[] b = new byte[contentLength];
int bytesRead = inFromClient.read(b, 0, contentLength);

Once read() has finished, bytesRead is sometimes equal to contentLength, but on other occasions read() does not seem to read as far as the end of the content. Does anyone have any ideas on what is happening? Is Java buffering output? Are there better ways of reading from sockets?

Andrew Thompson
Adam

2 Answers

You're assuming that read() fills the buffer. Check the Javadoc: all it promises is to transfer at least one byte.

You don't need both a large buffer and a BufferedInputStream. Replace the latter with a DataInputStream and call readFully().
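
A minimal sketch of the DataInputStream.readFully() approach follows. The class name ReadFullyExample, the helper readBody, and the sample payload are hypothetical, and a ByteArrayInputStream stands in for socket.getInputStream() so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ReadFullyExample {

    /**
     * Reads exactly contentLength bytes, blocking until all of them have
     * arrived. readFully() throws EOFException if the stream ends early.
     * (readBody is a hypothetical helper name, not part of the question.)
     */
    static byte[] readBody(InputStream in, int contentLength) throws IOException {
        byte[] body = new byte[contentLength];
        DataInputStream dataIn = new DataInputStream(in);
        dataIn.readFully(body, 0, contentLength);
        return body;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream(); the payload is made up.
        byte[] sent = "name=Adam&lang=Java".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(sent);

        byte[] body = readBody(in, sent.length);
        System.out.println(new String(body, StandardCharsets.US_ASCII));
    }
}
```

If the header is parsed through a separate buffered wrapper, that wrapper may already have consumed part of the body, so it is probably safest to read the header and the body through the same stream object.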

user207421

This is normal behavior for the read() method: you need to keep reading in a loop until you have received all the bytes you expect (here, contentLength) or read returns -1, which signals end of stream. (See http://docs.oracle.com/javase/7/docs/api/java/io/BufferedInputStream.html#read(byte[],%20int,%20int))

In general, it happens because the read method is trying to return all the data it can to you before blocking, not all the data you will ever get.
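
As a rough sketch of such a read loop, assuming the expected length is already known from the Content-Length header: the names ReadLoopExample, readExactly, and trickle are hypothetical, and trickle only simulates a socket that delivers a few bytes per call.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ReadLoopExample {

    /** Keeps calling read() until exactly len bytes have arrived or the stream ends. */
    static byte[] readExactly(InputStream in, int len) throws IOException {
        byte[] buf = new byte[len];
        int total = 0;
        while (total < len) {
            // read() may return fewer bytes than requested; ask only for what is missing
            int n = in.read(buf, total, len - total);
            if (n == -1) {
                throw new EOFException("stream ended after " + total + " of " + len + " bytes");
            }
            total += n;
        }
        return buf;
    }

    /** Hypothetical test stream that hands out at most 3 bytes per read call. */
    static InputStream trickle(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] sent = new byte[10];
        Arrays.fill(sent, (byte) 7);

        // A single read() returns only part of the data...
        int first = trickle(sent).read(new byte[10], 0, 10);
        System.out.println("single read returned " + first + " bytes");

        // ...while the loop collects all of it.
        byte[] body = readExactly(trickle(sent), sent.length);
        System.out.println("loop returned " + body.length + " bytes");
    }
}
```

Stopping at the expected length rather than waiting for -1 matters on a connection that stays open after the request, where end of stream may not arrive until the peer closes the socket.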

There are a couple of utility methods I frequently use for this sort of thing: (snipped out of context - note that I am not the author of the channelCopy method, but the source is attributed)

   // Imports needed by the methods below
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import java.io.InputStream;
   import java.io.OutputStream;
   import java.nio.ByteBuffer;
   import java.nio.channels.Channels;
   import java.nio.channels.ReadableByteChannel;
   import java.nio.channels.WritableByteChannel;

   /**
    * Efficiently copy from an InputStream to an OutputStream; uses channels and 
    * direct buffering for a faster copy than oldCopy. 
    * @param in - non-null readable inputstream
    * @param out - non-null writeable outputstream
    * @throws IOException if unable to read or write for some reason.
    */
   public static void streamCopy(InputStream in, OutputStream out) throws IOException {
      assert (in != null);
      assert (out != null);
      ReadableByteChannel inChannel = Channels.newChannel(in);
      WritableByteChannel outChannel = Channels.newChannel(out);
      channelCopy(inChannel, outChannel);
   }

   /**
    * Read the *BINARY* data from an InputStream into an array of bytes. Don't 
    * use this for text.
    * @param is - non-null InputStream
    * @return a byte array with all the bytes provided by the InputStream 
    * until it reaches EOF.
    * @throws IOException if the stream cannot be read.
    */
   public static byte[] getBytes(InputStream is) throws IOException{
      ByteArrayOutputStream os = new ByteArrayOutputStream();
      streamCopy(is, os);
      return os.toByteArray();
   }


   /**
    * A fast method to copy bytes from one channel to another; uses direct 16k 
    * buffers to minimize copies and OS overhead.
    * @author http://thomaswabner.wordpress.com/2007/10/09/fast-stream-copy-using-javanio-channels/
    * @param src - a non-null readable bytechannel to read the data from
    * @param dest - a non-null writeable byte channel to write the data to
    */   
   public static void channelCopy(final ReadableByteChannel src, final WritableByteChannel dest) throws IOException {
      assert (src != null);
      assert (dest != null);
      final ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024);
      while (src.read(buffer) != -1) {
         // prepare the buffer to be drained
         buffer.flip();
         // write to the channel, may block
         dest.write(buffer);
         // If partial transfer, shift remainder down
         // If buffer is empty, same as doing clear()
         buffer.compact();
      }

      // EOF will leave buffer in fill state
      buffer.flip();

      // make sure the buffer is fully drained.
      while (buffer.hasRemaining()) {
         dest.write(buffer);
      }
   }

JVMATL
  • The `channelCopy()` method's loop should be `while (src.read(buffer) > 0 || buffer.position() > 0)`. You can then get rid of the stuff at the end that finishes up at EOF. – user207421 Jan 14 '14 at 01:01
  • Hi @ejp, thanks for commenting; I expect you're probably right, but I don't quite get it - can you please elaborate? If I replace the `while (src.read(buffer) != -1)` with what you suggest, it seems to me that the copy operation might terminate early if, for example, an http transfer stalls temporarily (read could return 0 because no new data is available, and we could finish draining the buffer we have, leaving it at position 0), but it would be wrong to terminate the loop at this time (because EOF had not been reached) - what am I missing here? – JVMATL Jan 14 '14 at 14:52
  • You're missing the part after the `||`. – user207421 Jan 16 '14 at 10:17
  • @ejp I really want to get this function right, but I don't think you read until the end of my comment? IF read() returns 0 (no data currently available) AND buffer.position() returns 0 (we have drained all data from the buffer into the OutputStream), your while loop condition is false and the loop will exit. BUT, unless I am misreading the specs, that condition CAN happen before EOF during a stalled http transfer (the javadoc says read() CAN return 0, and an empty buffer has position==0), and it will cause copy to exit prematurely. Your reply doesn't seem to address the crux of my question. – JVMATL Jan 16 '14 at 13:41
  • Sorry for the crosstalk, @adam, I'm moving this discussion into my own question… :) – JVMATL Jan 16 '14 at 14:23
  • @ejp I think my question hinges on whether my underlying assumption, that the read() method might return 0 when the transfer stalls, is valid or not: the fact that you changed the code to test for `> 0` rather than `!= -1` suggests to me that you think it might. I invite you to consider my question at http://stackoverflow.com/questions/21166196/can-bufferedinputstream-readbyte-b-int-off-int-len-ever-return-0-are-ther – JVMATL Jan 16 '14 at 15:41
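
For reference, the loop condition proposed in the comments can be sketched as below. This assumes blocking channels: the ReadableByteChannel Javadoc guarantees that a channel in blocking mode blocks until at least one byte is read whenever the buffer has space, so read() returns 0 only when the buffer is full, and then position() is greater than 0. With a non-blocking channel the condition could become false before EOF, which is the concern raised above. The class name ChannelCopyVariant and the test data are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.util.Arrays;

public class ChannelCopyVariant {

    /** Copy loop using the condition from the comments; assumes blocking channels. */
    static void channelCopy(ReadableByteChannel src, WritableByteChannel dest) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024);
        // Continue while new bytes arrived, or while earlier bytes still wait to be written
        while (src.read(buffer) > 0 || buffer.position() > 0) {
            buffer.flip();      // switch the buffer to draining
            dest.write(buffer); // may write only part of the buffer
            buffer.compact();   // keep any unwritten remainder for the next pass
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[50000]; // larger than one 16k buffer
        Arrays.fill(data, (byte) 42);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        channelCopy(Channels.newChannel(new ByteArrayInputStream(data)),
                    Channels.newChannel(out));

        System.out.println("copied everything: " + Arrays.equals(data, out.toByteArray()));
    }
}
```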