
My program uses WebRequest and WebResponse to download an HTML file from a given URL on each iteration of a loop. For example, the URL string will look something like

http://www.aaaa.com/cccc=varB

where varB is a different string for each iteration through the loop.

After downloading the file into a stream, it searches the stream for specific strings of text and stores them in a separate text file. However, I found that on some iterations it doesn't seem to read anything (the URL for that iteration works when I type it into the address bar, so it's not an invalid URL).

I put the streams and WebResponse objects in using blocks, and I also have a try…catch block, but no exception occurs. Is using WebRequest and WebResponse problematic within loops?

try
{
    foreach (string name in names)
    {
        string urlstr = "…"; // URL format like I mentioned earlier

        HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(urlstr);
        myRequest.Timeout = 30000;

        //store the response in myResponse 
        using (HttpWebResponse myResponse = (HttpWebResponse)myRequest.GetResponse())
        {
            //retrieve the I/O stream associated with myResponse
            using (Stream myStream = myResponse.GetResponseStream())
            {
                //create a StreamReader to read the response line by line
                using (StreamReader myReader = new StreamReader(myStream))
                {
                    myReader.ReadLine(); // skip the first line of the response
                    sw.WriteLine(name + " " + myReader.ReadLine()); // sw is a text writer opened before the loop
                }
            }
        }
    }

    sw.Close();
}

The result will look similar to this:

name1 stuffReadfromfile
name2 stuffReadfromfile
name3 stuffReadfromfile
name4                        
name5 stuffReadfromfile
name6 
name7 stuffReadfromfile
name8 stuffReadfromfile
name9 
name10 stuffReadfromfile

even though there should be stuffReadfromfile after each name.

asked by ShadowCrossZero

2 Answers


Two things here:

First: read the entire response into a string using ReadToEnd() and then process that string:

//read the entire response into a single string
using (StreamReader myReader = new StreamReader(myStream))
{
    string content = myReader.ReadToEnd();
    // Process content
}

Second: set request.CachePolicy so you can be sure you always get the latest content from the server.
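
For example, something along these lines before calling GetResponse() (a sketch; NoCacheNoStore is just one possible cache level):

using System.Net;
using System.Net.Cache;

HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(urlstr);
myRequest.Timeout = 30000;
// Bypass any cached copy so each iteration hits the server directly.
myRequest.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);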

I also agree with the comment above about checking the status code before you do anything with the content.
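
For instance (a sketch based on the question's code, inside the foreach loop):

using (HttpWebResponse myResponse = (HttpWebResponse)myRequest.GetResponse())
{
    // Skip this iteration if the server did not return 200 OK.
    if (myResponse.StatusCode != HttpStatusCode.OK)
    {
        sw.WriteLine(name + " <HTTP " + (int)myResponse.StatusCode + ">");
        continue;
    }

    // ... read the response stream as before ...
}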

Hope that helps

Digvijay
  • The status code is ok for each iteration, and I also set the CachePolicy, but after some more experimenting I found out that the site I was querying blocks requests if there are too many from the same computer or IP within a short time span. On the plus side, I learned what StatusCode and CachePolicy are. – ShadowCrossZero Feb 11 '12 at 19:15
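
A simple mitigation for that kind of rate limiting (just a sketch; the 2-second delay is an arbitrary guess, not something the site documents) is to pause between iterations:

foreach (string name in names)
{
    // ... build the request and read the response as before ...

    // Wait between requests so the server does not see a burst from the same IP.
    System.Threading.Thread.Sleep(2000);
}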

I would use something like Fiddler to see what is actually going on, i.e. whether the data you're expecting is really returned by the server. BTW, why are you calling ReadLine() twice? Couldn't the first call swallow your data in some cases?
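
For example, a minimal sketch (reusing myStream, sw, and name from the question, and assuming the value you want is marked by a known substring rather than always sitting on the second line):

// requires: using System.Linq;
string content;
using (StreamReader myReader = new StreamReader(myStream))
{
    content = myReader.ReadToEnd();
}

// "marker" is a placeholder for whatever identifies the line you need.
string line = content.Split('\n').FirstOrDefault(l => l.Contains("marker"));
sw.WriteLine(name + " " + (line ?? "<not found>"));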

pavel.baravik