I'm writing a program (for my own use) in C# with HtmlAgilityPack that, at a certain point, loads a web page. After loading a lot of pages, I get this error:

Unhandled Exception: System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.IO.StreamReader.ReadBuffer()
   at System.IO.StreamReader.ReadToEnd()
   at HtmlAgilityPack.HtmlDocument.Load(TextReader reader) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 612
   at HtmlAgilityPack.HtmlWeb.Get(Uri uri, String method, String path, HtmlDocument doc, IWebProxy proxy, ICredentials creds) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1422
   at HtmlAgilityPack.HtmlWeb.LoadUrl(Uri uri, String method, WebProxy proxy, NetworkCredential creds) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1479
   at HtmlAgilityPack.HtmlWeb.Load(String url, String method) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1103
   at HtmlAgilityPack.HtmlWeb.Load(String url) in d:\Source\htmlagilitypack.new\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1061
   at ConsoleApplication1.Program.Main(String[] args) in c:\Users\...ConsoleApplication1\Program.cs:line 37

Line 37 is where I load a page inside a for loop:

for (var i = 0; i < 5000; i++)
{
    // Braces are required: C# does not allow a declaration as a bare loop body.
    var page = web.Load(url + Convert.ToString(i + 1) + "/");
}

I have tried to research the error, but there wasn't much information out there.

tshepang
breght
  • This has nothing to do with the Html Agility Pack library. The error comes from the HTTP/TCP/socket layers. It just means the server either has a problem or is refusing your calls. – Simon Mourier Aug 26 '13 at 09:07
  • OK, thank you for your answer, but how can I resolve this error? – breght Aug 26 '13 at 09:09
  • It can be caused by many things. If you don't own the server, you can't really know. They may detect you as a hacker for example. – Simon Mourier Aug 26 '13 at 09:49

1 Answer

I got the same error after downloading some 1,000+ web pages. I solved it by adding an extra catch for IOException inside the loop. Here is my code:

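// Needs: using System.IO; using System.Net; using System.Windows.Forms; using HtmlAgilityPack;
// doc, yUrl and reTryCounter are declared in the enclosing method (not shown).
HtmlWeb web = new HtmlWeb();

// Give every request a 15-second timeout.
web.PreRequest = delegate(HttpWebRequest webRequest)
{
    webRequest.Timeout = 15000;
    return true;
};

try
{
    doc = web.Load(yUrl);
}
catch (WebException ex)
{
    // Count the failure; the error is only reported on the 19th one.
    reTryCounter++;
    if (reTryCounter == 19)
    {
        MessageBox.Show("Error Program 1121 , Download webpage \n" + ex.ToString());
    }
}
catch (IOException ex2)
{
    // The connection was dropped while reading the response: report it and give up on this page.
    MessageBox.Show("Error Program 1125 , IOException Download webpage \n" + ex2.ToString());
    return null;
}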
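
The same idea can be pulled into a small helper so the loop from the question does not crash on the first dropped connection. This is only a sketch, not something HtmlAgilityPack provides: RetryHelper, WithRetry, maxAttempts and baseDelay are hypothetical names, and the assumption is that waiting a little longer after each failure gives the server time to accept requests again (which it may not, if it is blocking you on purpose).

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

static class RetryHelper
{
    // Hypothetical helper (not part of HtmlAgilityPack): runs `action` and retries
    // when it throws IOException or WebException, up to maxAttempts attempts in total.
    public static T WithRetry<T>(Func<T> action, int maxAttempts, TimeSpan baseDelay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex) when ((ex is IOException || ex is WebException) && attempt < maxAttempts)
            {
                // Wait longer after each failure (baseDelay, 2 * baseDelay, ...)
                // so the remote host is not hit again immediately.
                Thread.Sleep(TimeSpan.FromTicks(baseDelay.Ticks * attempt));
            }
            // On the last attempt the filter is false, so the exception reaches the caller.
        }
    }
}
```

Inside the question's loop this might be called as `var page = RetryHelper.WithRetry(() => web.Load(url + (i + 1) + "/"), 5, TimeSpan.FromSeconds(2));`, where 5 attempts and a 2-second base delay are arbitrary values to tune.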
Joe