
I have an application that sends newline-terminated messages over a TCP socket, which we read using TcpClient and its underlying NetworkStream.

The data is streaming in at roughly 28k every 100ms from a realtime data stream for monitoring.

I've stripped out the irrelevant code; this is basically how we read the data:

TcpClient socket; // initialized elsewhere
byte[] bigbuffer = new byte[0x1000000];
socket.ReceiveBufferSize = 0x1000000;
NetworkStream ns = socket.GetStream();
int end = 0;
int sizeToRead = 0x1000000;
int bytesRead;
while (true)
{
  bytesRead = ns.Read(bigbuffer, end, sizeToRead);
  sizeToRead -= bytesRead;
  end += bytesRead;

  // check for newline in read buffer, and if found, slice it up, and return
  // data for deserialization in another thread

  // circular buffer
  if (sizeToRead == 0)
  {
    sizeToRead = 0x1000000;
    end = 0;
  }
}

The symptom we were seeing, somewhat intermittently depending on how much data was being sent back, is a 'lag' of records: the data we read from the stream gets progressively older and older relative to what is being delivered (after a few minutes of streaming, the lag is on the order of tens of seconds), until eventually it all catches up in one big shot, and then the cycle repeats.

We fixed it by capping the size requested by each Read at 8192, and (whether or not this is required, I'm not sure, but we did it anyway) we removed the ReceiveBufferSize setting on the TcpClient and kept it at the default of 8192 (changing just ReceiveBufferSize did not correct it).

int sizeForThisRead = sizeToRead > 8192 ? 8192 : sizeToRead;
bytesRead = ns.Read(bigbuffer, end, sizeForThisRead);
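
For reference, a minimal sketch of the loop after the change (same structure as above; the newline scan and hand-off are still elided):

byte[] bigbuffer = new byte[0x1000000];
NetworkStream ns = socket.GetStream();
int end = 0;
int sizeToRead = 0x1000000;
while (true)
{
  // Never request more than 8192 bytes per call; Read still returns as soon
  // as any data is available, so this only bounds the request size.
  int sizeForThisRead = sizeToRead > 8192 ? 8192 : sizeToRead;
  int bytesRead = ns.Read(bigbuffer, end, sizeForThisRead);
  if (bytesRead == 0)
    break; // remote side closed the connection

  sizeToRead -= bytesRead;
  end += bytesRead;

  // check for newline in read buffer, and if found, slice it up, and return
  // data for deserialization in another thread

  // circular buffer
  if (sizeToRead == 0)
  {
    sizeToRead = 0x1000000;
    end = 0;
  }
}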

I thought maybe it was the interaction between Nagle's algorithm and delayed ACK, but Wireshark showed the data arriving just fine based on the timestamps and on inspecting the payload (which is itself timestamped; the server's and clients' clocks are synchronized to within a second).

We output logs right after the ns.Read call, and the issue is definitely at the Read call and not in the deserialization code.

So this leads me to believe that if you set the TcpClient's ReceiveBufferSize really big, and in the Read call on its underlying NetworkStream you pass a count that is many more bytes than you expect to arrive, there's some kind of timeout at the Read call waiting for those bytes to arrive, yet it still doesn't return everything in the stream. Each successive call in this loop seems to time out until the 1 meg buffer is full, after which, when 'end' gets reset back to 0, it inhales everything that's left in the stream and catches up all at once. But it shouldn't do that, because to me the logic looks like it should drain the stream completely on the next iteration (the next sizeToRead will still be greater than the data available in the buffer).
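
For reference, here is the sort of minimal standalone loopback test I've been using to poke at Read's behaviour. It's a sketch, not our application code, and it does not reproduce the problem; each Read asks for far more than will ever arrive and still returns as soon as any bytes are buffered.

using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

class ReadSemanticsDemo
{
  static void Main()
  {
    TcpListener listener = new TcpListener(IPAddress.Loopback, 0);
    listener.Start();
    int port = ((IPEndPoint)listener.LocalEndpoint).Port;

    // Sender: two small newline-terminated messages, half a second apart.
    new Thread(delegate()
    {
      using (TcpClient server = listener.AcceptTcpClient())
      using (NetworkStream outStream = server.GetStream())
      {
        byte[] first = Encoding.ASCII.GetBytes("first\n");
        outStream.Write(first, 0, first.Length);
        Thread.Sleep(500);
        byte[] second = Encoding.ASCII.GetBytes("second\n");
        outStream.Write(second, 0, second.Length);
      }
    }).Start();

    using (TcpClient client = new TcpClient("127.0.0.1", port))
    using (NetworkStream ns = client.GetStream())
    {
      byte[] buffer = new byte[0x1000000]; // mirrors the size used above
      int bytesRead;
      // Each Read asks for the entire buffer (far more than will ever arrive),
      // yet it returns as soon as *any* bytes are buffered - it does not wait
      // for the full count. A return of 0 means the sender closed the connection.
      while ((bytesRead = ns.Read(buffer, 0, buffer.Length)) > 0)
      {
        Console.WriteLine("{0:HH:mm:ss.fff}  read {1} bytes", DateTime.Now, bytesRead);
      }
    }
  }
}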

Or maybe it's something I'm not thinking about that I can't synthesize - but perhaps those clever souls here may think of something.

Or maybe this is expected behaviour - if so, why?

paquetp
  • Decreasing sizeToRead is a bug; it should always be equal to the buffer size. – Hans Passant Apr 29 '14 at 17:19
  • Why does it have to be the same size? – paquetp Apr 29 '14 at 19:29
  • Because it makes no sense whatsoever to read less than what can fit in the buffer you've allocated. And you in fact complain about the way it behaves, reading less and less until you suddenly reset it again. – Hans Passant Apr 29 '14 at 19:35
  • Oh, but it does make sense when you don't want to stomp on the data that came before it without first copying it somewhere else. I'll try changing this to always read a full 1 meg buffer and copy off what is read to a new buffer after each Read (a copy that is not required with the design above - a minor optimization), and let you know if it helps. – paquetp Apr 30 '14 at 02:04
  • Can you reproduce this in a short but complete program (well, pair of programs - one server, one client) so that we can test it for ourselves? – Jon Skeet May 01 '14 at 23:02
  • Just a random thought. Did you take a look at how much GC is going on? Any correlation between GC kicking in and delays? – steve cook May 03 '14 at 05:58
  • @JonSkeet - Well, I did what you asked and tried to reproduce it on my local network. Of course, these standalone applications do not exhibit the same symptom - so this makes me think it's related to something specific to the target I was running on, or something else I'm not thinking of. The difference between the standalone programs and the above is that the machine generating the data (the server) is a unix-like (vxWorks) system - however, looking at the Wireshark logs, the server was delivering the data on time. The target hardware was also different (slower core), but CPU utilization was low. – paquetp May 03 '14 at 13:17
  • The only other thing I can think of that differs between the standalone programs and the target is the .NET Framework: 2.0 on the target, while I'm running the 4 Client Profile in the standalone programs. Also, the target runs on a lower-priority thread (to not bog down GUI responsiveness given the amount of data coming in), but it's only BelowNormal. – paquetp May 03 '14 at 13:21
  • I can try running these stand alone programs on our target, and change the client to the vxWorks, but I don't want to spend too much time on this because it provides no benefit to our company, just my sanity :) – paquetp May 03 '14 at 13:22
  • I'm curious about your target application. You have packet captures that indicate the network data is delivered in a timely manner, but you were nevertheless seeing a lag of records. Is it a UI application with the lag appearing visually in a real-time display? I ask because I already have a simple pair of producer/consumer apps each with a UI. The only way I have been able to approximate the lag you describe is when I marshall the data to the UI thread for display. – Joel Allison May 04 '14 at 06:40
  • Yes, it is a UI application that our users use to monitor the data in realtime. The lag is definitely coming from the Read call on the NetworkStream, based on the logging we put in while the lag was happening. It's not, for example, in the deserialization. – paquetp May 04 '14 at 17:23

2 Answers

6

This behavior was so interesting that I just had to see it for myself, and... I couldn't.

This anti-answer presents an alternative theory that may explain the lag described in the question. I had to infer some details from the question and comments.

The target application is an interactive UI application with three threads of operation:

  1. A TcpClient network data consumer.
  2. A data queue consumer thread that delivers results to the UI.
  3. The UI thread.

For the purposes of this discussion, assume that TheDataQueue is a BlockingCollection<string> instance (any thread-safe queue would do):

BlockingCollection<string> TheDataQueue = new BlockingCollection<string>(1000);

The application has two synchronous operations that block while waiting for data. The first is the NetworkStream.Read call that is the primary subject of the question:

bytesRead = ns.Read(bigbuffer, end, sizeToRead);
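
The question doesn't show what happens to the bytes after that call, so for this discussion assume the network thread splits them on newlines and pushes complete lines into TheDataQueue - roughly like the following sketch (the names and the simplified buffering are assumptions of mine, and it reuses ns from the question plus System.Text's Encoding):

// Network consumer thread (sketch): read, split on '\n', enqueue complete lines.
// Buffer handling is simplified compared to the question's circular buffer.
string pending = string.Empty;
byte[] readBuffer = new byte[8192];
int bytesRead;
while ((bytesRead = ns.Read(readBuffer, 0, readBuffer.Length)) > 0)
{
    pending += Encoding.ASCII.GetString(readBuffer, 0, bytesRead);
    int newline;
    while ((newline = pending.IndexOf('\n')) >= 0)
    {
        TheDataQueue.Add(pending.Substring(0, newline));  // hand the line to the consumer thread
        pending = pending.Substring(newline + 1);
    }
}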

The second blocking operation occurs when data in the work queue is marshalled to the UI for display. Let's suppose the code looks like this:

// A member method on the derived class of `System.Windows.Forms.Form` for the UI.
public void MarshallDataToUI()
{
    // Current thread: data queue consumer thread.
    // This call blocks if the data queue is empty.
    string text = TheDataQueue.Take();

    // Marshall the text to the UI thread.
    Invoke(new Action<string>(ReceiveText), text);
}

private void ReceiveText(string text)
{
    // Display the text.
    textBoxDataFeed.Text = text;

    // Explicitly process all Windows messages currently in the message queue to force
    // immediate UI refresh.  We want the UI to display the very latest data, right?
    // Note that this can be relatively slow...
    Application.DoEvents();
}

In this application design, the observed lag occurs when the network delivers data to TheDataQueue faster than the UI can display it.

Why might @paquetp's logs show a problem with NetworkStream.Read?

NetworkStream.Read blocks until data is available. If the logs report the elapsed time while waiting for more data, then there will be an apparent delay. But the TcpClient network buffer is actually empty because the application has already read and queued the data. If the realtime data stream is bursty, then this will happen often.
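
One way to tell these cases apart in the logs (an instrumentation sketch of my own, reusing bigbuffer, end, sizeToRead and TheDataQueue from the question and this answer) is to record NetworkStream.DataAvailable, the time spent inside each Read, and the queue depth. A long Read preceded by DataAvailable == false means the wire was genuinely quiet; quick Reads alongside a growing queue depth mean the backlog is downstream of the network.

// Instrumentation sketch; requires System.Diagnostics for Stopwatch.
while (true)
{
    bool hadData = ns.DataAvailable;        // was anything already buffered by the socket?
    Stopwatch sw = Stopwatch.StartNew();
    int bytesRead = ns.Read(bigbuffer, end, sizeToRead);
    sw.Stop();
    Console.WriteLine("DataAvailable={0}  blockedMs={1}  bytesRead={2}  queueDepth={3}",
        hadData, sw.ElapsedMilliseconds, bytesRead, TheDataQueue.Count);
    if (bytesRead == 0)
        break;                              // remote side closed the connection
    // ... slice into lines and enqueue as before ...
}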

How do you explain that eventually it all catches up in one big shot?

This is a natural consequence of the data queue consumer thread working through the backlog in TheDataQueue.

But what about the packet capture and data timestamps?

If an item is backlogged in TheDataQueue, the data timestamps are correct. But you can't see them yet in the UI. The packet capture timestamps are timely because the network data was received and queued quickly by the network thread.

Isn't this all just guesswork?

Nope. There is a pair of custom apps (producer and consumer) that demonstrates this behavior.

[Screenshot: Network Consumer App]

The screenshot shows that the data queue is backlogged by 383 items. The data timestamp lags the current timestamp by about 41 seconds. I paused the producer a few times to simulate bursty network data.

However, I was never able to make NetworkStream.Read behave as the question supposed.

Joel Allison
  • This is fantastic. However, one question - wouldn't the CPU utilization be at 100% in this situation? – paquetp May 07 '14 at 11:56
  • In these apps, the UI thread in the Consumer pushes the CPU pretty hard, but the rest of the pipeline is I/O bound. – Joel Allison May 07 '14 at 13:47
  • So I put a bit of grey matter towards this and came up with the following analysis: we output in the log not only the delay between reads but bytesRead as well, and the amount of data read is consistent with the lag; when it "catches up", bytesRead is an order of magnitude higher for each Read call and the delay between calls is much smaller. So the symptom I describe above is an artifact of not servicing what was read fast enough, and in your app the log output would show that you're reading all the data in a timely manner. Is this correct? – paquetp May 07 '14 at 14:54
  • Yes to *not servicing fast enough*. With respect to logging, `NetworkStream.DataAvailable` can tell you whether data is ready to read from the network buffer. When it is false, you can expect the next `Read` to block waiting for more data. Try to correlate this with your `Read` delays. – Joel Allison May 07 '14 at 15:09
  • So, because it's an artifact of not servicing the data queue fast enough, your app should still service the Read fast enough and read everything in the NetworkStream - should it not? If you output bytesRead in your Read call, does it correlate with the lag in the UI, like I described above? I ask because the CPU has a significant amount of idle time (70% idle) when the program I'm describing experiences this behaviour. – paquetp May 07 '14 at 15:45
  • Ah, OK, I get it now - I see in the UI you have a queue count, and you mention it's behind by 383 items. In our case, our data queue is empty when it's lagging (it has to be, because we haven't read the data yet). – paquetp May 07 '14 at 15:51
  • In my example apps, I can start with an empty data queue in the consumer, then burst some data and watch the queue fill very quickly. After that, the consumer demonstrates a lag as it processes the data queue, while the network buffer is empty. – Joel Allison May 07 '14 at 18:38
  • You are welcome to have the sample apps if you think they would help your investigation. – Joel Allison May 07 '14 at 18:39
  • Even though we didn't solve why my system is behaving this way, this answer took the most effort, so I granted it the bounty. – paquetp May 08 '14 at 22:57
  • @paquetp I most enjoyed it. Thanks. – Joel Allison May 09 '14 at 00:06
1

TcpClient.NoDelay Property gets or sets a value that disables a delay when send or receive buffers are not full.

When NoDelay is false, a TcpClient does not send a packet over the network until it has collected a significant amount of outgoing data. Because of the amount of overhead in a TCP segment, sending small amounts of data is inefficient. However, situations do exist where you need to send very small amounts of data or expect immediate responses from each packet you send. Your decision should weigh the relative importance of network efficiency versus application requirements.

Source: http://msdn.microsoft.com/en-us/library/system.net.sockets.tcpclient.nodelay(v=vs.110).aspx

Push Bit Interpretation: By default, Windows Server 2003 TCP/IP completes a recv() call when one of the following conditions is met:

  1. Data arrives with the PUSH bit set
  2. The user recv buffer is full
  3. 0.5 seconds have elapsed since any data has arrived

If a client application is run on a computer with a TCP/IP implementation that does not set the push bit on send operations, response delays may result. It is best to correct this on the client; however, a configuration parameter (IgnorePushBitOnReceives) was added to Afd.sys to force it to treat all arriving packets as though the push bit were set.

Try to reduce the buffer size to force the vendor network implementation to set the PSH bit.
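
If you want to experiment with the receive side, the relevant knobs on TcpClient are straightforward (a sketch; the host and port are placeholders, and whether it helps depends on the sending side's TCP stack):

TcpClient client = new TcpClient();
client.NoDelay = true;                 // disable Nagle's delay on this end
client.ReceiveBufferSize = 8192;       // keep the socket receive buffer small, as in the question's fix
client.Connect("example-host", 5000);  // placeholder host/port
NetworkStream ns = client.GetStream();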

Source: http://technet.microsoft.com/en-us/library/cc758517(WS.10).aspx (under Push Bit Interpretation)
Source: http://technet.microsoft.com/en-us/library/cc781532(WS.10).aspx (under IgnorePushBitOnReceives)

madrang
  • I'll check the push bit in the Wireshark logs, and if it's not set, I'll roll back our fix, set the registry key, and see if that removes the lag. If so, this answer will be accepted. It most certainly sounds like it would explain what I'm seeing. – paquetp May 04 '14 at 17:25
  • alas - the push bit is being set by the server sending the data, so this can't be it. Good one though. – paquetp May 05 '14 at 16:57