
My program uses WebRequest and WebResponse to download an HTML file from a given URL on each iteration of a loop. For example, the URL string will look something like

http://www.aaaa.com/cccc=varB

where varB is a different string for each iteration through the loop.

After downloading the file into a stream, it searches the stream for specific strings of text and stores them in a separate text file. However, I found that on some iterations it doesn't seem to read anything (the URL for that iteration works when I type it into the address bar, so it's not an invalid URL).

I put the streams and WebResponse objects in using blocks, and I also have a try…catch block, but no exception occurs. Is using WebRequest and WebResponse problematic within loops?

try
{
    foreach (string name in names)
    {
        string urlstr = "…"; // URL format like I mentioned earlier

        HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(urlstr);
        myRequest.Timeout = 30000;

        //store the response in myResponse 
        using (HttpWebResponse myResponse = (HttpWebResponse)myRequest.GetResponse())
        {
            //retrieve the I/O stream associated with myResponse
            using (Stream myStream = myResponse.GetResponseStream())
            {
                //create a StreamReader to read the response line by line
                using (StreamReader myReader = new StreamReader(myStream))
                {
                    myReader.ReadLine(); // skip the first line of the response
                    sw.WriteLine(name + " " + myReader.ReadLine()); // sw is a text writer opened before the loop
                }
            }
        }
    }

    sw.Close();
}

The result will look similar to this:

name1 stuffReadfromfile
name2 stuffReadfromfile
name3 stuffReadfromfile
name4                        
name5 stuffReadfromfile
name6 
name7 stuffReadfromfile
name8 stuffReadfromfile
name9 
name10 stuffReadfromfile

even though there should be stuffReadfromfile after each name.

asked by ShadowCrossZero

2 Answers


Two things here:

First: read the entire response into a string using ReadToEnd() and then process that string:

//read the entire response into a single string
using (StreamReader myReader = new StreamReader(myStream))
{
    string content = myReader.ReadToEnd();
    // Process content
}

Second: set request.CachePolicy so you can be sure you always get the latest content from the server.
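
For example, something along these lines before calling GetResponse() (a sketch; NoCacheNoStore is just one possible cache level):

using System.Net;
using System.Net.Cache;

HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(urlstr);
myRequest.Timeout = 30000;
// Bypass any cached copy so each iteration hits the server directly.
myRequest.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);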

I also agree with the comment above about checking the status code before you do anything with the content.
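
For instance (a sketch based on the question's code, inside the foreach loop):

using (HttpWebResponse myResponse = (HttpWebResponse)myRequest.GetResponse())
{
    // Skip this iteration if the server did not return 200 OK.
    if (myResponse.StatusCode != HttpStatusCode.OK)
    {
        sw.WriteLine(name + " <HTTP " + (int)myResponse.StatusCode + ">");
        continue;
    }

    // ... read the response stream as before ...
}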

Hope that helps

Digvijay
  • The status code is ok for each iteration, and I also set the CachePolicy, but after some more experimenting I found out that the site I was querying blocks requests if there are too many from the same computer or IP within a short time span. On the plus side, I learned what StatusCode and CachePolicy are. – ShadowCrossZero Feb 11 '12 at 19:15
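
A simple mitigation for that kind of rate limiting (just a sketch; the 2-second delay is an arbitrary guess, not something the site documents) is to pause between iterations:

foreach (string name in names)
{
    // ... build the request and read the response as before ...

    // Wait between requests so the server does not see a burst from the same IP.
    System.Threading.Thread.Sleep(2000);
}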

I would use something like Fiddler to see what is actually going on, i.e. whether the data you're expecting is really returned by the server. BTW, why are you calling ReadLine() twice? Couldn't the first call swallow your data in some cases?
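
For example, a minimal sketch (reusing myStream, sw, and name from the question, and assuming the value you want is marked by a known substring rather than always sitting on the second line):

// requires: using System.Linq;
string content;
using (StreamReader myReader = new StreamReader(myStream))
{
    content = myReader.ReadToEnd();
}

// "marker" is a placeholder for whatever identifies the line you need.
string line = content.Split('\n').FirstOrDefault(l => l.Contains("marker"));
sw.WriteLine(name + " " + (line ?? "<not found>"));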

pavel.baravik