
I'm struggling to find anyone experiencing a similar issue, or anything close to it.

I'm currently consuming a JSON stream over HTTP which has a GZip requirement, and I'm experiencing a delay between when the data is sent and when reader.ReadLine() reads it. It has been suggested to me that this could be the decompression holding data back in a buffer.

This is what I have currently; it works fine apart from the delay.

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(endPoint);
request.Method = "GET";

request.PreAuthenticate = true;
request.Credentials = new NetworkCredential(username, password);

request.AutomaticDecompression = DecompressionMethods.GZip;
request.ContentType = "application/json";
request.Accept = "application/json";
request.Timeout = 30;
request.BeginGetResponse(AsyncCallback, request);

Then inside the AsyncCallback method I have:

HttpWebRequest request = result.AsyncState as HttpWebRequest;

using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
using (Stream stream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
{

    while (!reader.EndOfStream)
    {
        string line = reader.ReadLine();
        if (string.IsNullOrWhiteSpace(line)) continue;

        Console.WriteLine(line);
    }
}

It just sits on reader.ReadLine() until more data is received, and even then holds back some of it. There are also keep-alive newlines in the stream; these are often read out all at once when it does decide to return something.

I have tested the stream running side by side with a curl command running, the curl command receives and decompresses the data perfectly fine.

Any insight would be terrific. Thanks,

Dan

EDIT Had no luck reducing the buffer size on the StreamReader:

new StreamReader(stream, Encoding.UTF8, true, 1)

EDIT Also had no luck updating to .NET 4.5 and using:

request.AllowReadStreamBuffering = false;
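
EDIT One diagnostic worth trying (a sketch, not my production code; setting Accept-Encoding by hand is an assumption, and is only legal when AutomaticDecompression is left off): read the raw compressed bytes and log when they arrive, to see whether the delay is in the network or in the decoder.

```csharp
HttpWebRequest diag = (HttpWebRequest)WebRequest.Create(endPoint);
diag.Method = "GET";
diag.PreAuthenticate = true;
diag.Credentials = new NetworkCredential(username, password);
// No AutomaticDecompression here, so the header may be set manually.
diag.Headers[HttpRequestHeader.AcceptEncoding] = "gzip";

using (HttpWebResponse response = (HttpWebResponse)diag.GetResponse())
using (Stream stream = response.GetResponseStream())
{
    byte[] buffer = new byte[2048];
    int read;
    while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        // The bytes are gzip gibberish; only the arrival timing matters.
        Console.WriteLine("{0:O}: received {1} bytes", DateTime.UtcNow, read);
    }
}
```

If bytes arrive promptly here but lines do not in the decompressing version, the delay is being introduced on the decoding side rather than by the server.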
Dan Saltmer
  • Hmm..why not use `reader.ReadToEnd()`? – JerKimball Feb 07 '13 at 20:37
  • It's an HTTP stream, kept open over a very long period of time, so I need to handle each line as it comes in. I believe .ReadToEnd() will just wait until EndOfStream is received? Which isn't likely to happen. – Dan Saltmer Feb 07 '13 at 20:39
  • Ah, so it's a keep-alive style connection where you'd get incremental responses back? – JerKimball Feb 07 '13 at 20:43
  • @Dan - try `request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;` and lmk! – Parimal Raj Feb 07 '13 at 21:10
  • @AppDeveloper no luck I'm afraid, I know it's explicitly using gzip though. Which I have a feeling is where the delay is being introduced. – Dan Saltmer Feb 07 '13 at 21:18
  • @DanSaltmer - any chance of sharing the url? – Parimal Raj Feb 07 '13 at 21:18
  • I have a question: how does _stopStream come into play? – Parimal Raj Feb 07 '13 at 21:19
  • Afraid not, it's actually a GNIP stream. Which doesn't help I guess that I can't provide an example! _stopStream is just a bool that gets set when I call a method on the wrapper, always false, has no impact right now. – Dan Saltmer Feb 07 '13 at 21:23
  • @JerKimball has provided a sample code though to test with. Will update my question to show this. – Dan Saltmer Feb 07 '13 at 21:49
  • Thinking thru this more, this sounds like "expected behavior": if you've got a stream you are compressing, you can't effectively "chunk" arbitrary pieces of that stream and decompress them *in situ*...I don't know what `curl` does - maybe it's not actually sending the `Accept:gzip` header? – JerKimball Feb 07 '13 at 22:18
  • Yeah, I think that's what's happening: comment out the `AutomaticDecompression` line in that harness and it will respond as data is sent, but of course it's gobbledegook. – JerKimball Feb 07 '13 at 22:20
  • Yeah, I've always thought it would be tied to the compression. But without it isn't an option, it is a flat out 406 response if I don't accept it. – Dan Saltmer Feb 07 '13 at 22:24
  • @DanSaltmer Yeah, this is almost certainly what's going on - if you change that harness to read without decompression, you'll see that for each server message, it's only sending like 2-4 bytes incrementally...you're probably boned in this case. Sorry. :( – JerKimball Feb 07 '13 at 22:55
  • Haha, yeah I started looking at it like that. I'm currently working on something to remove the automatic decompression. See how I get on with that! Thanks for your help though. – Dan Saltmer Feb 07 '13 at 23:09
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/24180/discussion-between-dan-saltmer-and-jerkimball) – Dan Saltmer Feb 08 '13 at 10:06

3 Answers


Update: This seems to have issues over long periods of time at higher volumes, and should only be used for low-volume streams where the buffering impacts the application's functionality. I have since switched back to a StreamReader.

So this is what I ended up with. It works without the delay, because it bypasses the automatic GZip decompression and its buffering.

using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
using (Stream stream = response.GetResponseStream())
using (MemoryStream memory = new MemoryStream())
using (GZipStream gzip = new GZipStream(memory, CompressionMode.Decompress))
{
    byte[] compressedBuffer = new byte[8192];
    byte[] uncompressedBuffer = new byte[8192];
    List<byte> output = new List<byte>();

    while (stream.CanRead)
    {
        int readCount = stream.Read(compressedBuffer, 0, compressedBuffer.Length);
        if (readCount == 0) break; // server closed the stream

        // Feed only the freshly received compressed bytes to the decompressor.
        memory.Write(compressedBuffer, 0, readCount);
        memory.Position = 0;

        int uncompressedLength = gzip.Read(uncompressedBuffer, 0, uncompressedBuffer.Length);

        output.AddRange(uncompressedBuffer.Take(uncompressedLength));

        if (!output.Contains(0x0A)) continue;

        byte[] bytesToDecode = output.Take(output.LastIndexOf(0x0A) + 1).ToArray();
        string outputString = Encoding.UTF8.GetString(bytesToDecode);
        output.RemoveRange(0, bytesToDecode.Length);

        string[] lines = outputString.Split(new[] { Environment.NewLine }, StringSplitOptions.None);
        for (int i = 0; i < (lines.Length - 1); i++)
        {
            Console.WriteLine(lines[i]);
        }

        memory.SetLength(0);
    }
}
Dan Saltmer

There may be something to the Delayed ACK C.Evenhuis discusses, but I've got a weird gut feeling it's the StreamReader that's causing you headaches...you might try something like this:

public void AsyncCallback(IAsyncResult result)
{
    HttpWebRequest request = result.AsyncState as HttpWebRequest;   
    using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
    using (Stream stream = response.GetResponseStream())
    {
        var buffer = new byte[2048];
        while(stream.CanRead)
        {
            var readCount = stream.Read(buffer, 0, buffer.Length);
            var line = Encoding.UTF8.GetString(buffer.Take(readCount).ToArray());
            Console.WriteLine(line);
        }
    }
}

EDIT: Here's the full harness I used to test this theory (maybe the difference from your situation will jump out at you)

(LINQPad-ready)

void Main()
{
    Task.Factory.StartNew(() => Listener());
    _blocker.WaitOne();
    Request();
}

public bool _running;
public ManualResetEvent _blocker = new ManualResetEvent(false);

public void Listener()
{
    var listener = new HttpListener();
    listener.Prefixes.Add("http://localhost:8080/");
    listener.Start();
    "Listener is listening...".Dump();
    _running = true;
    _blocker.Set();
    var ctx = listener.GetContext();
    "Listener got context".Dump();
    ctx.Response.KeepAlive = true;
    ctx.Response.ContentType = "application/json";
    var outputStream = ctx.Response.OutputStream;
    using(var zipStream = new GZipStream(outputStream, CompressionMode.Compress))
    using(var writer = new StreamWriter(outputStream))
    {
        var lineCount = 0;
        while(_running && lineCount++ < 10)
        {
            writer.WriteLine("{ \"foo\": \"bar\"}");
            "Listener wrote line, taking a nap...".Dump();
            writer.Flush();
            Thread.Sleep(1000);
        }
    }
    listener.Stop();
}

public void Request()
{
    var endPoint = "http://localhost:8080";
    var username = "";
    var password = "";
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(endPoint);
    request.Method = "GET";

    request.PreAuthenticate = true;
    request.Credentials = new NetworkCredential(username, password);

    request.AutomaticDecompression = DecompressionMethods.GZip;
    request.ContentType = "application/json";
    request.Accept = "application/json";
    request.Timeout = 30;
    request.BeginGetResponse(AsyncCallback, request);
}

public void AsyncCallback(IAsyncResult result)
{
    Console.WriteLine("In AsyncCallback");    
    HttpWebRequest request = result.AsyncState as HttpWebRequest;    
    using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
    using (Stream stream = response.GetResponseStream())
    {
        while(stream.CanRead)
        {
            var buffer = new byte[2048];
            var readCount = stream.Read(buffer, 0, buffer.Length);
            var line = Encoding.UTF8.GetString(buffer.Take(readCount).ToArray());
            Console.WriteLine("Reader got:" + line);
        }
    }
}

Output:

Listener is listening...
Listener got context
Listener wrote line, taking a nap...
In AsyncCallback
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}
JerKimball
  • Thanks, but no luck. Same behavior: when the line came through it had multiple bits of data in it which had been held back. I think the issue lies with the gzip somewhere. – Dan Saltmer Feb 07 '13 at 21:20
  • Hmm...odd, I threw together a quick harness to test this and saw what I *think* you're after...let me append the full harness to my answer. – JerKimball Feb 07 '13 at 21:26
  • 1
    That's actually really useful, it replicates the issue I am having. With two minor changes to your code to fix it. Pass the GZipStream to the writer. And add ctx.Response.AddHeader("Content-Encoding", "gzip"); – Dan Saltmer Feb 07 '13 at 21:41
  • Hah - yeah, those two changes would make a difference... :) Let me give it a think, if I come up with anything I'll augment this answer. – JerKimball Feb 07 '13 at 21:47
  • This is the output using your listener and your code. http://screencast.com/t/Po5WbK4eVw1 – Dan Saltmer Feb 07 '13 at 21:48
  • @DanSaltmer I'm assuming you have no control over the server side, right? – JerKimball Feb 07 '13 at 21:52
  • Correct, I think I said somewhere, this is actually a stream provided by GNIP. – Dan Saltmer Feb 07 '13 at 21:57

This may have to do with Delayed ACK in combination with Nagle's algorithm. It occurs when the server sends multiple small responses in a row.

On the server side, the first response is sent, but subsequent response data chunks are only sent once the server has received an ACK from the client, or once it has accumulated enough data for a big packet to send (Nagle's algorithm).

On the client side, the first bit of response is received, but the ACK is not sent immediately - since traditional applications have a request-response-request-response behavior, it assumes it can send the ACK along with the next request - which in your case does not happen.

After a fixed amount of time (500 ms?) it decides to send the ACK anyway, causing the server to send the next packets it has accumulated so far.

The problem (if this is indeed the problem you're experiencing) can be fixed on the server side at the socket level by setting the NoDelay property, disabling Nagle's algorithm. I think you can also disable it operating system wide.
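
For reference, a minimal sketch of what disabling Nagle looks like on each side in .NET (both properties are real APIs; whether they help depends on which side is doing the batching, and the server-side half only applies if you control the server):

```csharp
// Client side: disable Nagle for the ServicePoint this request goes through.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(endPoint);
request.ServicePoint.UseNagleAlgorithm = false;

// Server side (only if you control it): disable Nagle on the accepted socket.
TcpListener listener = new TcpListener(IPAddress.Any, 8080);
listener.Start();
Socket client = listener.AcceptSocket();
client.NoDelay = true; // push small writes out immediately
```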

You could also temporarily disable Delayed ACK (I know Windows has a registry entry for it) on the client side to see if this is indeed the problem, without having to change anything on your server. Delayed ACK helps protect against DoS attacks, so make sure you restore the setting afterwards.

Sending keepalives less frequently may also help, but you'll still have a chance for the problem to occur.

C.Evenhuis
  • Thanks for the response but this doesn't seem like the issue. I also don't have access to the server. The server is proven to work for other languages, and I have no issues running the connection in curl, that receives every line as it comes. Disabling Nagle's locally has no effect. I'm going to investigate an approach without HttpWebRequest for the time being though. – Dan Saltmer Feb 07 '13 at 21:06
  • If this was the problem, it would be the server's Nagle causing the issues. Too bad I couldn't help you out, good luck. – C.Evenhuis Feb 07 '13 at 21:07