I'm trying to load a page whose URL I got from an RSS feed, and the request throws the following WebException:

Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones.

with an inner exception:

Invalid URI: The hostname could not be parsed.

Here's the code I'm using:

System.Net.HttpWebRequest req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(url);
string source = String.Empty;
Uri responseURI;
try
{
    req.UserAgent=@"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:31.0) Gecko/20100101 Firefox/31.0";
    req.Headers.Add("Accept-Language", "en-us,en;q=0.5");
    req.AllowAutoRedirect = true;
    using (System.Net.WebResponse webResponse = req.GetResponse())
    {
        using (HttpWebResponse httpWebResponse = webResponse as HttpWebResponse)
        {
            responseURI = httpWebResponse.ResponseUri;
            StreamReader reader;
            if (httpWebResponse.ContentEncoding.ToLower().Contains("gzip"))
            {
                reader = new StreamReader(new GZipStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
            }
            else if (httpWebResponse.ContentEncoding.ToLower().Contains("deflate"))
            {
                reader = new StreamReader(new DeflateStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
            }
            else
            {
                reader = new StreamReader(httpWebResponse.GetResponseStream());
            }
            source = reader.ReadToEnd();
            reader.Close();
        }
    }
}
catch (WebException we)
{
    Console.WriteLine(url + "\n--\n" + we.Message);
    return null;
}

I'm not sure whether I'm doing something wrong or whether there's an extra step I'm missing. Any help would be greatly appreciated! Let me know if you need more information.

############ UPDATE

After following Jim Mischel's suggestions, I've narrowed it down to a `UriFormatException` with the message "Invalid URI: The hostname could not be parsed."

Here's the URL in the last "Location" header: http:////www-nc.nytimes.com/

I can see why parsing it fails (there are four slashes after the scheme), but I don't understand why it causes trouble here when the original URL loads fine in my browser. Is there something I should be doing to handle this malformed URL?
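One possible way to cope with a "Location" value like this is to follow redirects manually and clean up the header before building the next `Uri`. The sketch below is only an illustration under assumptions: `RedirectHelper`, `NormalizeLocation`, and `GetFollowingRedirects` are hypothetical names, and collapsing the extra slashes after the scheme is a guess at what a lenient browser effectively does, not documented `HttpWebRequest` behavior.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class RedirectHelper
{
    // Hypothetical helper: collapses "http:////host/" to "http://host/"
    // and resolves relative Location values against the current URI.
    public static Uri NormalizeLocation(Uri baseUri, string location)
    {
        string cleaned = Regex.Replace(
            location, @"^(https?:)/{2,}", "$1//", RegexOptions.IgnoreCase);
        return new Uri(baseUri, cleaned);
    }

    // Hypothetical helper: follows 3xx responses by hand so each Location
    // header can be normalized first. The caller disposes the response.
    public static HttpWebResponse GetFollowingRedirects(string url, int maxRedirects = 10)
    {
        Uri current = new Uri(url);
        for (int i = 0; i <= maxRedirects; i++)
        {
            var req = (HttpWebRequest)WebRequest.Create(current);
            req.AllowAutoRedirect = false;
            req.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;

            var resp = (HttpWebResponse)req.GetResponse();
            int status = (int)resp.StatusCode;
            string location = resp.Headers["Location"];

            // Not a redirect (or no target given): hand the response back.
            if (status < 300 || status >= 400 || string.IsNullOrEmpty(location))
                return resp;

            resp.Close();
            current = NormalizeLocation(current, location);
        }
        throw new WebException("Too many redirects.");
    }
}
```

With this, `RedirectHelper.NormalizeLocation(new Uri("http://www.nytimes.com/"), "http:////www-nc.nytimes.com/")` yields a `Uri` whose host is `www-nc.nytimes.com`. Whether the server then serves the expected page is a separate question that needs checking against the real feed.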

shadonar
  • Set `req.AllowAutoRedirect = false;` and run the code. When you get the response, check the "Location" header. That will tell you what URL it's trying to redirect to. Also, if you set [AutomaticDecompression](http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.automaticdecompression.aspx) `= DecompressionMethods.Deflate | DecompressionMethods.GZip`, decompression will be handled for you automatically. – Jim Mischel Jul 15 '14 at 18:31
  • @JimMischel So, basically, I need to handle the redirects manually, correct? I just want to make sure I understand what you're saying. – shadonar Jul 15 '14 at 19:28
  • I don't know if you have to manually handle the redirects. My comment was to help you in debugging. Something about the redirect URL is apparently causing `HttpWebRequest` to throw an exception. My comment tells you how to find out what URL it's being redirected to. – Jim Mischel Jul 15 '14 at 20:04
  • @JimMischel Thank you for clarifying. It's definitely helping me narrow down what the problem is. – shadonar Jul 15 '14 at 20:08
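
The debugging step suggested in the comments could look roughly like the sketch below. `RedirectDebug`, `CreateRequest`, and `GetRedirectLocation` are hypothetical names introduced for illustration; the properties they set (`AllowAutoRedirect`, `AutomaticDecompression`) are the ones named in the comments.

```csharp
using System;
using System.Net;

public static class RedirectDebug
{
    // Builds a request that will not follow redirects on its own and
    // lets the framework handle gzip/deflate decompression.
    public static HttpWebRequest CreateRequest(string url)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.AllowAutoRedirect = false;
        req.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return req;
    }

    // Sends the request and returns the raw "Location" header,
    // or null when the response carries none.
    public static string GetRedirectLocation(string url)
    {
        using (var resp = (HttpWebResponse)CreateRequest(url).GetResponse())
        {
            return resp.Headers["Location"];
        }
    }
}
```

Calling `GetRedirectLocation` once per hop and printing the result shows each redirect target as a plain string, before anything tries to parse it as a `Uri`.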