My problem is a bit complex, and I don't know how best to explain it.

I have a list of 125k public proxies. As you can guess, most of them are invalid.

I would like to test all of them quickly.

So I have written an application that spawns 250 concurrent fetching tasks with Parallel.ForEach.

In other words, I am constantly fetching the same page through 250 different proxies to see whether they work.

Each task uses one of the proxies, fetches the same page, and looks at the returned source.

It returns true or false depending on whether the source is valid.

I have set the maximum allowed number of concurrent connections per host to 1,000:

ServicePointManager.DefaultConnectionLimit = 1000;
// Sets the maximum number of concurrent connections to the same host
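
For reference, the setup described above can be sketched roughly as follows. This is a minimal sketch, not my exact code: the class `ProxyTestSketch`, the method `FindWorkingProxies`, and the `isProxyWorking` delegate are hypothetical names used only for illustration. The delegate stands in for "fetch the page through this proxy and validate the source".

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;

public static class ProxyTestSketch
{
    // Hypothetical driver: runs isProxyWorking over every proxy with at most
    // maxParallel checks in flight, and returns the proxies that passed.
    public static List<string> FindWorkingProxies(
        IEnumerable<string> proxies, Func<string, bool> isProxyWorking, int maxParallel = 250)
    {
        // Raise the per-host connection cap, as in the question.
        ServicePointManager.DefaultConnectionLimit = 1000;

        // Thread-safe collection, because the loop body runs on many threads.
        var working = new ConcurrentBag<string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(proxies, options, proxy =>
        {
            if (isProxyWorking(proxy))
                working.Add(proxy);
        });

        return new List<string>(working);
    }
}
```

In the real application the delegate would wrap the fetching function with the proxy passed as `srProxy`, and the proxies are strings in "host:port" form.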

A few minutes after the fetching starts, I can no longer open any page in my web browser. The status bar at the bottom shows "Resolving host".

I am not sure how to debug what exactly the problem is.

I suspect that DNS resolution is somehow breaking, or that some other error is occurring. Any ideas are welcome.

My working environment is: C# on .NET 4.6.2, Windows 8.1, 25 MBs fiber connection

I am using the fetching function below:
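
One thing that could be logged while the test runs is how many TCP connections the machine has in each state. The following is a minimal sketch under that assumption, not a confirmed diagnosis; the class `TcpStateSketch` and its method names are hypothetical. If the SynSent or TimeWait counts keep climbing until the browser stalls, that would suggest connection exhaustion rather than a DNS fault.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.NetworkInformation;

public static class TcpStateSketch
{
    // Counts how many entries fall into each TCP state.
    public static Dictionary<TcpState, int> CountByState(IEnumerable<TcpState> states)
    {
        return states.GroupBy(s => s).ToDictionary(g => g.Key, g => g.Count());
    }

    // Takes a snapshot of all TCP connections currently open on this machine.
    public static Dictionary<TcpState, int> Snapshot()
    {
        TcpConnectionInformation[] connections =
            IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpConnections();
        return CountByState(connections.Select(c => c.State));
    }

    // Prints the snapshot, most common state first.
    public static void Print()
    {
        foreach (var pair in Snapshot().OrderByDescending(p => p.Value))
            Console.WriteLine("{0,-12} {1}", pair.Key, pair.Value);
    }
}
```

Calling `TcpStateSketch.Print()` from a timer every few seconds during the run would show whether the counts grow without bound.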

public static cs_HttpFetchResults func_fetch_Page(string srUrl, int irTimeOut = 60,
    string srRequestUserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0",
    string srProxy = null,
    int irCustomEncoding = 0,
    bool blAutoDecode = true, bool blKeepAlive = true) {
    cs_HttpFetchResults mycs_HttpFetchResults = new cs_HttpFetchResults();
    mycs_HttpFetchResults.srFetchingFinalURL = srUrl;

    try {
        HttpWebRequest request = (HttpWebRequest) WebRequest.Create(srUrl);
        request.CookieContainer = new System.Net.CookieContainer();

        if (srProxy != null) {
            // srProxy is expected in "host:port" form
            string[] arrProxyParts = srProxy.Split(':');
            string srProxyHost = arrProxyParts[0];
            int irProxyPort = Int32.Parse(arrProxyParts[1]);
            System.Net.WebProxy my_awesomeproxy = new WebProxy(srProxyHost, irProxyPort);
            my_awesomeproxy.Credentials = new NetworkCredential();
            request.Proxy = my_awesomeproxy;
        }
        else {
            // Bypass any system-configured proxy
            request.Proxy = null;
        }

        request.Timeout = irTimeOut * 1000; // seconds to milliseconds
        request.UserAgent = srRequestUserAgent;
        request.KeepAlive = blKeepAlive;
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

        WebHeaderCollection myWebHeaderCollection = request.Headers;
        myWebHeaderCollection.Add("Accept-Language", "en-gb,en;q=0.5");
        myWebHeaderCollection.Add("Accept-Encoding", "gzip, deflate");

        request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;

        using (WebResponse response = request.GetResponse())
        using (Stream strumien = response.GetResponseStream()) {
            Encoding myEncoding = Encoding.UTF8;

            using (StreamReader sr = new StreamReader(strumien, myEncoding)) {
                mycs_HttpFetchResults.srFetchBody = sr.ReadToEnd();
                mycs_HttpFetchResults.blResultSuccess = true;
            }
        }
    }
    catch {
        // Any failure (timeout, refused connection, malformed proxy string)
        // is swallowed here; blResultSuccess simply stays false.
    }

    return mycs_HttpFetchResults;
}
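
The result type is not shown above. As a minimal sketch, assuming it holds nothing beyond the three members the function touches, `cs_HttpFetchResults` would look roughly like this (the default values are an assumption):

```csharp
// Assumed shape of the result type; only the members used by
// func_fetch_Page are listed, and the defaults are illustrative.
public class cs_HttpFetchResults
{
    public string srFetchingFinalURL = "";  // URL that was requested
    public string srFetchBody = "";         // page source, empty on failure
    public bool blResultSuccess = false;    // true only if the body was read
}
```

A proxy counts as working only when `blResultSuccess` is true and `srFetchBody` passes the source check.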

After a while, I noticed that the "Service Host: Local System" process grows considerably in RAM usage.


Furkan Gözükara
  • I think the DNS part is an XY problem. A web request starts with a DNS lookup, which requires an available network connection. If the connection is saturated, no more requests can be made, including DNS resolve requests. – CodeCaster Mar 20 '17 at 11:06
  • There is a finite number of TCP requests you can make concurrently. Your DNS server could be deciding you're trying to DDoS it, or you've reached your TCP limit. Does the description mean you're reading 1000 pages per proxy you try? – BugFinder Mar 20 '17 at 11:09
  • @BugFinder No, I mean I am requesting the same page through 250 different proxies to test whether each proxy works. I am using OpenDNS and Google DNS. How can I determine what the problem is? Is the DNS server blocking me? – Furkan Gözükara Mar 20 '17 at 11:37
  • @CodeCaster Any idea how I can debug the cause? This happens when I test proxies, but if I use 250 concurrent connections without proxies, the system works fine. – Furkan Gözükara Mar 20 '17 at 11:39
  • @MonsterMMORPG Have you considered using Proxicity.io for this? They have an API that serves public proxies that have been checked and verified. They remove the old, broken proxies and add new ones constantly. Check it out - https://www.proxicity.io – cmeadows Mar 31 '17 at 16:12
