C# Won't Load A Certain XML, But Works in Browser

Question

I'm new(er) to C#, but I have a background with Java and VB.NET, so jumping in was easy. This weekend I started a new mini-project with C# and a public XML feed from the interwebs. But I'm having a problem loading the XML. Here's my code:

string url = ... ;
...
XmlDocument xmlDoc = new XmlDocument();
...         
try{                
    xmlDoc.Load(url);
}catch(Exception e){
    Console.WriteLine(e);
}

When I attempt to load the XML, it throws an exception:

https://i.stack.imgur.com/Xo2Ra.png (Newbies can't attach pictures, sorry)

I wasn't at all surprised when my code didn't work. I started the standard troubleshooting process by figuring out where the problem was. I fully expected my code to be faulty. To test this theory, I found a random XML feed on the web and pointed my code at it instead. To my surprise, it loaded just fine. Now my suspicion shifted to the target XML. It works fine in Chrome and Firefox (loads in 0.734 seconds), does not require any credentials (open to the public), and is valid and well-formed.

Then I remembered a JavaScript that I had written a few months ago that uses this same feed. I fired that up, and found it to also be working perfectly.

I'm at a loss here because both my code and the XML seem to be fine. Does anyone know how this can be fixed? Do I need to use an HttpWebRequest and pass the result to the XmlDocument (I don't know how to do this)? Are there any other ways to troubleshoot this?

XmlDocument.Load is relatively primitive for fetching content from the web. What URL are you dealing with? What protocol, and is it secured (HTTPS)? Both can cause problems when using XmlDocument.Load out of the box. — Polity, Oct 17 '11 at 03:18
Here's the URL, straight from the address bar: http://stats.us.playstation.com/warhawk/XmlFeedAction.action?start=1&end=1 ... EDIT: I just loaded the XML in Chrome perfectly fine and directly copied/pasted the URL to this comment. When I checked the link, I got a 404 error. — Sonic42, Oct 17 '11 at 03:39

score 4 · Accepted Answer · answered Oct 17 '11 at 03:59

As I indicated in my comment, XmlDocument.Load is fairly primitive compared to a full-blown request from a browser. If you use a proxy or packet tracer such as Fiddler, you will find that IE9, for example, sends a request with specific headers:

GET http://stats.us.playstation.com/warhawk/XmlFeedAction.action?start=1&end=1 HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Host: stats.us.playstation.com
Cookie: JSESSIONID=HLygTblTG13HhXqqw80jw9Wdhw0q03dxcQLp04fD3Q5yChYvPGn6!-882698034; SONYCOOKIE1=543467712.20480.0000

The web server's behavior depends on the headers sent with the request. In this case, the Accept and User-Agent headers play a role. I can successfully load the XML content into an XmlDocument by sending some fake browser headers, like this:

        // Requires: using System.Net; using System.Xml;
        string url = "http://stats.us.playstation.com/warhawk/XmlFeedAction.action?start=1&end=1";

        string data;
        using (WebClient client = new WebClient())
        {
            // Mimic a browser; the server responds differently without these headers.
            client.Headers["User-Agent"] = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1";
            client.Headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            data = client.DownloadString(url);
        }

        // Parse the downloaded text as XML.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(data);
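
Since the question asked about HttpWebRequest specifically: the same idea can be sketched with that class, passing the response stream straight to XmlDocument.Load. This is only a rough sketch under the same assumptions as above (the header values are the same fake browser headers, and I have not verified which of them the server actually requires). The names FeedLoader and CreateBrowserLikeRequest are made up for illustration. The catch block also covers the "how do I troubleshoot this" part, because it prints the HTTP status code the server sent back instead of just the exception.

```csharp
using System;
using System.Net;
using System.Xml;

static class FeedLoader
{
    // Builds a request carrying browser-like headers. The values are illustrative;
    // which of them the server really needs is an assumption.
    public static HttpWebRequest CreateBrowserLikeRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        // Decompress gzip/deflate responses transparently if the server sends them.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Sends the request and parses the response body as XML.
    public static XmlDocument Load(string url)
    {
        HttpWebRequest request = CreateBrowserLikeRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            var doc = new XmlDocument();
            doc.Load(stream);
            return doc;
        }
    }
}

static class Program
{
    static void Main()
    {
        string url = "http://stats.us.playstation.com/warhawk/XmlFeedAction.action?start=1&end=1";
        try
        {
            XmlDocument doc = FeedLoader.Load(url);
            Console.WriteLine(doc.DocumentElement.Name);
        }
        catch (WebException e)
        {
            // For HTTP-level failures, e.Response carries the server's status code.
            var http = e.Response as HttpWebResponse;
            Console.WriteLine(http != null ? "HTTP " + (int)http.StatusCode : e.Message);
        }
        catch (XmlException e)
        {
            // The server answered, but the body was not well-formed XML.
            Console.WriteLine("Not valid XML: " + e.Message);
        }
    }
}
```

Loading from the stream rather than from a downloaded string leaves encoding detection to the XML parser, which may matter if the feed declares an encoding in its XML declaration. On newer .NET versions HttpClient is the recommended API, so treat this as a period-appropriate sketch rather than the one right way to do it.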

Nice answer. So the problem isn't the XML, it's that the server doesn't respond the same way to a request with browser headers as it does to the header-less request from the C# XML DOM object, right? — Chris Shain, Oct 17 '11 at 04:10
@ChrisShain - Exactly, you have to replicate the behavior of a browser manually in your code or use a more advanced/specific library to do this for you — Polity, Oct 17 '11 at 04:11
Thank you! I knew xmlDoc.Load() wasn't the best way to go about doing this (the reason I mentioned trying to use HttpWebRequest in the OP), but I really wasn't sure what else to use/how to use it. You've earned a footnote, good sir. — Sonic42, Oct 17 '11 at 04:18
