
I'm trying to validate URLs from multiple threads and write the results to a DataTable. The validation works fine when a single thread is used.

Works fine (single thread):

    foreach (string url in urllist)
    {
        Boolean valid = CheckURL(url);

        this.Invoke((MethodInvoker)delegate()
        {
            if (valid)
            {
                dt.Rows[counter][2] = "Valid";
                validcount++;
            }
            else
            {
                dt.Rows[counter][2] = statusCode;
                invalidcount++;
            }
            counter++;
        });
    }

But when I try to do this using multiple threads, some valid URLs are reported as invalid and vice versa.

Multiple threads (not working):

    Parallel.ForEach(urllist, ProcessUrl);

    private void ProcessUrl(string url)
    {
        Boolean valid = CheckURL(url);

        this.Invoke((MethodInvoker)delegate()
        {
            if (valid)
            {
                dt.Rows[counter][2] = "Valid";
                validcount++;
            }
            else
            {
                dt.Rows[counter][2] = statusCode;
                invalidcount++;
            }

            counter++;
        });
    }

Associated Method and Class

    private Boolean CheckURL(string url)
    {
        using (MyClient myclient = new MyClient())
        {
            try
            {
                myclient.HeadOnly = true;
                myclient.Headers.Add(HttpRequestHeader.UserAgent, "My app.");
                // fine, no content downloaded
                string s1 = myclient.DownloadString(url);
                statusCode = null;
                return true;
            }
            catch (WebException error)
            {
                if (error.Response != null)
                {
                    HttpStatusCode scode = ((HttpWebResponse)error.Response).StatusCode;
                    if (scode != null)
                    {
                        statusCode = scode.ToString();
                    }
                }
                else
                {
                    statusCode = "Unknown Error";
                }
                return false;
            }
        }
    }

    class MyClient : WebClient
    {
        public bool HeadOnly { get; set; }

        protected override WebRequest GetWebRequest(Uri address)
        {
            WebRequest req = base.GetWebRequest(address);

            req.Timeout = 10000;
            if (HeadOnly && req.Method == "GET")
            {
                req.Method = "HEAD";
            }
            return req;
        }
    }

What am I doing wrong? Please advise.
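
My current suspicion, not confirmed: `statusCode` is a field shared by all threads, and `counter` advances in completion order rather than list order, so a row may end up with another URL's result. Below is a minimal sketch of a variant that would avoid both, assuming the rows of `dt` are in the same order as `urllist`. `UrlResult`, `CheckAll` and the `check` delegate are hypothetical names; `check` stands in for a `CheckURL` rewritten to return its status instead of writing to a field, and the lambda in `Main` only simulates it without touching the network.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class UrlCheckSketch
{
    // Hypothetical result type: the status travels with the result
    // instead of living in a field shared by all threads.
    public sealed class UrlResult
    {
        public bool Valid;
        public string Status;
    }

    // Checks every URL in parallel and stores each result at the index
    // of its URL, so completion order no longer matters.
    public static UrlResult[] CheckAll(IList<string> urls, Func<string, UrlResult> check)
    {
        var results = new UrlResult[urls.Count];
        Parallel.For(0, urls.Count, i =>
        {
            // Each iteration writes only its own slot, so no lock is needed.
            results[i] = check(urls[i]);
        });
        return results;
    }

    static void Main()
    {
        var urls = new List<string> { "http://a.example/", "http://b.example/missing" };

        // Simulated check (assumption): URLs ending in "missing" count as invalid.
        UrlResult[] results = CheckAll(urls, url =>
        {
            bool ok = !url.EndsWith("missing");
            return new UrlResult { Valid = ok, Status = ok ? "Valid" : "NotFound" };
        });

        for (int i = 0; i < results.Length; i++)
        {
            Console.WriteLine("{0}: {1}", urls[i], results[i].Status);
        }
    }
}
```

With this shape, `results[i]` always belongs to `urllist[i]`, so `dt.Rows[i][2]` and the valid/invalid counts could be filled in on the UI thread in a single `Invoke` after the loop finishes.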

UPDATE:

How I start the task:

    var ts = new CancellationTokenSource();
    CancellationToken ct = ts.Token;
    Task.Factory.StartNew(() =>
    {
        if (nameresfailcount > 10)
        {
            if (ct.IsCancellationRequested)
            {
                // another thread decided to cancel
                Console.WriteLine("task canceled");
                break; // this is the line the compiler rejects
            }
        }
        // stuff
    }, ct).ContinueWith(task =>
    {
        _benchmark.Stop();
    });
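
The `break` above sits inside a lambda rather than a loop, which is why the compiler rejects it. A minimal sketch of how cancellation might be wired instead, assuming the work runs in `Parallel.ForEach`: the token is passed through `ParallelOptions.CancellationToken`, and an iteration leaves with `return`. The failure counter is simulated here (every iteration counts as a name resolution failure), and `CancelSketch` and `nameResFailCount` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class CancelSketch
{
    static void Main()
    {
        var urls = new List<string>();
        for (int i = 0; i < 100; i++)
        {
            urls.Add("http://host" + i + ".example/");
        }

        var cts = new CancellationTokenSource();
        var options = new ParallelOptions { CancellationToken = cts.Token };
        int nameResFailCount = 0;

        try
        {
            Parallel.ForEach(urls, options, url =>
            {
                // Simulation (assumption): treat every request as a
                // name resolution failure and count it atomically.
                if (Interlocked.Increment(ref nameResFailCount) > 10)
                {
                    cts.Cancel();
                    return; // leaves this iteration; 'break' is not allowed in a lambda
                }
            });
        }
        catch (OperationCanceledException)
        {
            // Parallel.ForEach stops scheduling new iterations once the
            // token is canceled and reports it with this exception.
            Console.WriteLine("task canceled");
        }
    }
}
```

Iterations that are already running finish normally, so a long request is bounded by its own timeout rather than by the token.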
techno
  • What method are you executing in `this.Invoke((MethodInvoker)delegate()`? – Jeric Cruz Jul 20 '17 at 03:57
  • @JericCruz I was updating a DataGridView there. A binding source is used now, so I replaced it with a DataTable. – techno Jul 20 '17 at 04:11
  • I think the "this" references the wrong object. Can you try the following: dataGridViewResults.BeginInvoke(new Action(() => { //make changes })); – netblognet Jul 20 '17 at 08:38
  • @netblognet Thanks a lot for your reply. I have removed BeginInvoke completely, as `dt` is a `DataTable`, not a `DataGridView`, but the issue is still the same. I think the index (`counter`) might be causing it. – techno Jul 20 '17 at 08:47
  • But a change in the datatable reflects a change in the datagrid. I'm not sure, but I think this isn't thread-safe. So try to enclose the code part where you change the datatable with the begininvoke code from the datagrid. – netblognet Jul 20 '17 at 08:49
  • @netblognet Tried it, and the same issue persists. When I check the same URLs with your 404 Checker, it works perfectly. – techno Jul 20 '17 at 09:01
  • @netblognet You are implementing a Queue and a Timer. Is that necessary? – techno Jul 20 '17 at 09:02
  • I used a Queue for the work items and then used Task.Factory for the parallel processing. – netblognet Jul 20 '17 at 09:04
  • @netblognet What do you think my issue is? Any idea? – techno Jul 20 '17 at 09:05
  • No idea. Can you give a more detailed error description? – netblognet Jul 20 '17 at 09:06
  • @netblognet There is no error, but some valid links are reported as invalid, e.g. 404 for valid URLs. I guess there is some problem with the `counter` variable. – techno Jul 20 '17 at 09:12
  • Hm or your WebClient isn't configured well. But you should be able to figure this out by debugging the code. – netblognet Jul 20 '17 at 09:25
  • @netblognet Okay, I will try to pin down the error. – techno Jul 20 '17 at 09:31
  • @netblognet I will adapt your project and work on that. I hope that's okay. I'm fed up with this. – techno Jul 21 '17 at 04:02
  • No problem. Feel free to re-use my code. If you publish your software, drop me a link via mail/message. I'm interested to see how you modified it. – netblognet Jul 21 '17 at 04:44
  • @netblognet Thanks a lot :) Sure, if it succeeds I will send you something as well. I hope you will accept it. – techno Jul 21 '17 at 04:54
  • @netblognet Just one question: what does the "Bypass Windows Hosts File" option do? Does it use a different DNS server, e.g. Google's DNS? There is no way for the user to provide input. Please clarify. – techno Jul 21 '17 at 07:33
  • This option uses https://www.nuget.org/packages/ARSoft.Tools.Net as DNS resolver. Some users had problems with the Windows native DNS resolver, so that the tool showed errors, even if the servers were reachable. Therefore I implemented this alternative DNS resolver. – netblognet Jul 21 '17 at 08:24
  • @netblognet Thanks for the reply. So basically it uses an alternative DNS resolver if the name servers are not returned correctly by the Windows DNS resolver. Do you remember under what scenario such errors occurred? – techno Jul 21 '17 at 11:24
  • @netblognet How would you suggest implementing cancellation if the Internet connection drops while the URLs are being checked? I currently count `NameResolutionFailure` responses and, after 10 of them, try to cancel the task with a cancellation token, but I don't know how to cancel it: the compiler reports `Control cannot leave the body of an anonymous method or lambda expression`. Please see my update. – techno Jul 27 '17 at 07:47
  • @netblognet How are you? I have been testing your code. The application seems to get stuck or become unresponsive when a large number of URLs (20,000-40,000) is used. Why do you think that happens? Please advise. – techno Nov 27 '17 at 05:37
