
I want to run about 10,000 concurrent requests using .NET HttpWebRequest; not all of them go to the same host, and some of them go through a pool of proxies.

I'm currently using threads, which works fine up to 1000 concurrent requests (around 3% CPU in Task Manager), but when I scale up to 2000 or even 5000 concurrent requests, I get many exceptions (see below) and 100% CPU load.
When I noticed that, I first thought it was too much for the server, but running, for example, 2 instances with 1000 concurrent requests each also works, so I'm guessing it's something with the connection management.

First of all, here is the sample request code (I'm guessing it's all fine, since it works until I scale up):

public string HttpGet(string url)
{
    // requires: using System.IO; using System.Net;
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = 20000;
        request.CookieContainer = Cookie;
        request.AutomaticDecompression = DecompressionMethods.GZip;
        request.KeepAlive = true;
        request.Method = "GET";

        // using blocks dispose the response, stream, and reader even when
        // ReadToEnd throws, so the connection is returned to the pool.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream dataStream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(dataStream))
        {
            return reader.ReadToEnd();
        }
    }
    catch (Exception)
    {
        // Swallowing every exception hides the real failure; log it at least.
        return "";
    }
}

And yes, of course I (think I) also set ServicePointManager correctly:

ServicePointManager.DefaultConnectionLimit = 20 * 1000;
ServicePointManager.MaxServicePointIdleTime = 1000 * 60 * 20;//maybe this is an issue?
ServicePointManager.UseNagleAlgorithm = false;
ServicePointManager.Expect100Continue = false;
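
Worth noting: DefaultConnectionLimit is applied per ServicePoint (roughly one per host, or one per proxy when requests are proxied), not as a single global cap. The effective limit for any one endpoint can be inspected and overridden like this (example.com is just a placeholder for one of my targets):

ServicePoint sp = ServicePointManager.FindServicePoint(new Uri("http://example.com/"));

// The limit can be raised or lowered for this endpoint alone, and
// CurrentConnections shows how many connections it actually holds open.
sp.ConnectionLimit = 1000;
Console.WriteLine("Limit: {0}, open: {1}", sp.ConnectionLimit, sp.CurrentConnections);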

The strange thing is that I'm getting timeouts (not a real server timeout; it's the kind of timeout that occurs when you have a low DefaultConnectionLimit and try to make requests in parallel), even when the application doesn't use even half of the connection limit.

So I decided to try using ConnectionGroupName to give each thread a unique connection:

request.ConnectionGroupName = randomStringPerThreadBasis;
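
(randomStringPerThreadBasis is just a unique name per thread; one hypothetical way to produce it, for illustration only:)

// Hypothetical illustration: derive a unique group name per thread from
// the managed thread id (requires using System.Threading).
request.ConnectionGroupName = "grp-" + Thread.CurrentThread.ManagedThreadId;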

That at least increased the number of open TCP connections, so besides raising the .NET DefaultConnectionLimit, I also increased the available dynamic TCP ports under Windows, as described here: http://kb.globalscape.com/KnowledgebaseArticle10438.aspx, and with netsh int ipv4 set dynamicport tcp start=1025 num=50000

The strange thing again is that the app now opens many more connections than are needed, but still has a high CPU footprint and still times out occasionally (though much less often than before, so maybe it really is something with the connection management?).

For completeness: I'm also getting these exceptions sometimes (though not very often):

System.Net.WebException: The underlying connection was closed: A connection that was expected to be kept alive was closed by the server. ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.FixedSizeReader.ReadPacket(Byte[] buffer, Int32 offset, Int32 count)
   at System.Net.Security._SslStream.StartFrameHeader(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security._SslStream.StartReading(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security._SslStream.ProcessRead(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.TlsStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.PooledStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.Connection.SyncRead(HttpWebRequest request, Boolean userRetrievedStream, Boolean probeRead)
   at System.Net.HttpWebRequest.GetResponse()
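
One mitigation suggested in the comments below is retry logic for these transient faults; here is a minimal sketch, assuming HttpGet is changed to rethrow on failure instead of returning "" (the attempt count and backoff values are placeholders):

// Minimal retry sketch for transient faults; assumes HttpGet rethrows
// its exceptions instead of swallowing them as in the version above.
public string HttpGetWithRetry(string url, int maxAttempts)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            return HttpGet(url);
        }
        catch (WebException)
        {
            if (attempt >= maxAttempts)
                throw; // give up after the last attempt
            Thread.Sleep(1000 * attempt); // simple linear backoff, tune as needed
        }
    }
}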

So my question is: how can I solve this? Did I do something wrong? Are there workarounds? Can I manage the connections on my own? Is there another class or language that just works for this case? etc.

Tearsdontfalls
  • Have you tried to reduce the number of calls, e.g. via batch processing? Or is this not an option? (On a side note: don't ask for a library or your question might be flagged.) – Quality Catalyst Feb 26 '15 at 01:16
  • Isn't an option (relying on a stupid API). – Tearsdontfalls Feb 26 '15 at 01:29
  • Yes, you are doing something wrong - exactly: DDoS'ing some of the servers. To defend themselves, servers will throttle, so I guess you are hitting throttling limits, not resource limits. A couple of ideas: 1) add throttling on the client side per server; 2) add retry logic for transient faults. And of course, you don't want to spawn ~10k threads - use .NET Task and the *Async version of the request to reduce CPU usage and expensive threads. The reason two processes work might be that you are opening the connections at different times (first 1000, then the next 1000). – Mikl X Feb 26 '15 at 01:42
  • May I ask why you are trying to open that many simultaneous connections? It is highly unlikely that you have enough bandwidth to actually achieve a faster transfer than a properly queued mechanism with lower parallelism. – Tim Feb 26 '15 at 01:52
  • @Tim Because the host doesn't support something like SPDY or HTTP/2.0 and I need to send many concurrent requests; I will try out async. – Tearsdontfalls Feb 26 '15 at 14:28
  • I agree with the others. You need to constrain your outstanding requests. Only allow N concurrent requests at a time, and tweak N till you get a good balance of throughput without killing your machine or the target. Also, your CPU usage might not have anything to do with connection management. Have you profiled your app to see where the CPU is being used? – feroze Feb 27 '15 at 01:27
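
Following the suggestions in the comments (switch to Task-based async requests and cap the number in flight), here is a minimal sketch of that approach, assuming .NET 4.5+; the concurrency cap of 1000 and the example URL are placeholders to tune:

using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

class ThrottledClient
{
    // Cap on requests in flight, per the comments; 1000 is a placeholder.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1000);

    public static async Task<string> HttpGetAsync(string url)
    {
        await Gate.WaitAsync();
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.AutomaticDecompression = DecompressionMethods.GZip;
            request.Method = "GET";
            // Note: HttpWebRequest.Timeout is not honored on the async path;
            // cancellation/abort has to be handled separately if needed.

            // GetResponseAsync releases the thread while waiting on the
            // network, unlike the synchronous GetResponse in the question.
            using (WebResponse response = await request.GetResponseAsync())
            using (Stream stream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(stream))
            {
                return await reader.ReadToEndAsync();
            }
        }
        catch (WebException)
        {
            return "";
        }
        finally
        {
            Gate.Release();
        }
    }

    static void Main()
    {
        // Placeholder target; substitute the real URL list.
        var urls = Enumerable.Repeat("http://example.com/", 10000);
        Task<string>[] tasks = urls.Select(HttpGetAsync).ToArray();
        Task.WaitAll(tasks);
        Console.WriteLine("Completed {0} requests", tasks.Length);
    }
}

With the semaphore in place, all 10,000 tasks can be created up front while only the capped number of requests actually runs at once, which keeps thread and CPU usage low.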

0 Answers