1
XDocument xd = XDocument.Load("http://www.google.com/ig/api?weather=vilnius&hl=lt");

The ampersand & isn't a supported character in a string containing a URL when calling the Load() method. This error occurs:

XmlException was unhandled: Invalid character in the given encoding

How can you load XML from a URL into an XDocument where the URL has an ampersand in the querystring?

p.campbell
Wizard

3 Answers

8

You need to URL-encode it as &amp;:

XDocument xd = XDocument.Load(
    "http://www.google.com/ig/api?weather=vilnius&amp;hl=lt");

You might be able to get away with using WebUtility.HtmlEncode to perform this conversion automatically; however, be aware that this is not the intended use of that method.

Edit: The real issue here has nothing to do with the ampersand, but with the way Google is encoding the XML document using a custom encoding and failing to declare it. (Ampersands only need to be encoded when they occur within special contexts, such as the <a href="…" /> element of (X)HTML. Read Ampersands (&'s) in URLs for a quick explanation.)
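A minimal sketch of why the entity-encoded form is not equivalent in this context: it splits the query string of both URLs on & and prints the parameter names. The helper name `QueryKeys` is made up for this illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper: returns the parameter names found in a URL's query string.
static string[] QueryKeys(string url) =>
    new Uri(url).Query.TrimStart('?')
        .Split('&')
        .Select(pair => pair.Split('=')[0])
        .ToArray();

string[] plain = QueryKeys("http://www.google.com/ig/api?weather=vilnius&hl=lt");
string[] encoded = QueryKeys("http://www.google.com/ig/api?weather=vilnius&amp;hl=lt");

Console.WriteLine(string.Join(", ", plain));   // weather, hl
Console.WriteLine(string.Join(", ", encoded)); // weather, amp;hl
```

With the entity in place, the second parameter is no longer named hl, so the server would receive a different request.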

Since the XML declaration does not specify the encoding, XDocument.Load falls back to the default UTF-8 encoding, as required by the XML specification, and that encoding is incompatible with the actual data.
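The effect of a missing encoding declaration can be reproduced offline. The sketch below uses made-up documents, not Google's actual response, and assumes ISO-8859-1 purely for illustration: the same bytes load when the declaration names the encoding and fail when it does not.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

// Made-up sample documents; U+00E9 becomes the single byte 0xE9, which is not valid UTF-8 here.
byte[] declared = latin1.GetBytes("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><r v=\"caf\u00E9\"/>");
byte[] undeclared = latin1.GetBytes("<?xml version=\"1.0\"?><r v=\"caf\u00E9\"/>");

// The declaration tells the parser how to decode the bytes, so this load succeeds.
XDocument ok = XDocument.Load(new MemoryStream(declared));
Console.WriteLine(ok.Root.Attribute("v").Value);

// Without a declaration the parser assumes UTF-8 and rejects the byte 0xE9.
bool failed = false;
try
{
    XDocument.Load(new MemoryStream(undeclared));
}
catch (XmlException ex)
{
    failed = true;
    Console.WriteLine(ex.Message);
}
```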

To circumvent this issue, you can fetch the raw data and decode it manually using the sample below. I don’t know whether the encoding really is Windows-1252, so you might need to experiment a bit with other encodings.

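using System.Net;
using System.Text;
using System.Xml.Linq;

string url = "http://www.google.com/ig/api?weather=vilnius&hl=lt";

// Download the raw bytes so that the decoding step is under our control.
byte[] data;
using (WebClient webClient = new WebClient())
    data = webClient.DownloadData(url);

// Decode with the assumed encoding, then parse the resulting string.
string str = Encoding.GetEncoding("Windows-1252").GetString(data);
XDocument xd = XDocument.Parse(str);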
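If the encoding is not known in advance, one option is to try strict UTF-8 first and only then fall back to a candidate encoding. This is a sketch under assumptions: `DecodeWithFallback` is a hypothetical helper, the byte array stands in for the downloaded data, and ISO-8859-1 stands in for whichever encoding the server actually uses.

```csharp
using System;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: decode as UTF-8 if the bytes are valid UTF-8, otherwise use the fallback.
static string DecodeWithFallback(byte[] data, Encoding fallback)
{
    // throwOnInvalidBytes: true makes GetString throw instead of silently substituting U+FFFD.
    var strictUtf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
    try
    {
        return strictUtf8.GetString(data);
    }
    catch (DecoderFallbackException)
    {
        return fallback.GetString(data);
    }
}

Encoding candidate = Encoding.GetEncoding("ISO-8859-1"); // assumed candidate encoding
byte[] sample = candidate.GetBytes("<r v=\"caf\u00E9\"/>"); // stand-in for the downloaded bytes

XDocument doc = XDocument.Parse(DecodeWithFallback(sample, candidate));
Console.WriteLine(doc.Root.Attribute("v").Value);
```

Trying UTF-8 first is a reasonable default because text in a single-byte encoding that contains non-ASCII characters is rarely valid UTF-8 by accident.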
Douglas
  • -1. This answers a different question - "what if my query parameter contains an ampersand" - rather than the OP's. – Alexei Levenkov May 04 '12 at 20:34
  • @AlexeiLevenkov: I believe you might be mistaken. The `XmlException` error quoted by the OP is caused specifically by the unencoded ampersand, and the resolution is explained in my answer. It is their implied assumption – that an ampersand cannot be passed to `Load`, even if encoded – that is incorrect. – Douglas May 04 '12 at 21:02
  • Unlikely :) The OP already asked why a URL constructed the way you suggest does not give the expected results - http://stackoverflow.com/questions/10455776/c-sharp-google-weather-xml-getting. Note that your suggestion changes the second query parameter from "hl=..." to "amp;hl=...", which significantly modifies the URL. – Alexei Levenkov May 04 '12 at 21:10
  • The issue wasn’t that I answered a different question, but that the OP’s question (_“How can you load XML from a URL into an XDocument where the URL has an ampersand in the querystring?”_) was based on an incorrect premise. – Douglas May 04 '12 at 21:41
  • Here is your +1 for a complete answer. I agree that the OP's question was unclear (or rather unrelated to the issue). Maybe adding an explanation that your suggestion changes the query completely would stop the OP from going down the wrong route. – Alexei Levenkov May 04 '12 at 21:47
  • Added parenthesized note that `&` does not need to be encoded in this case. – Douglas May 04 '12 at 21:53
2

There is nothing wrong with your code - it is perfectly OK to have & in the query string; that is how parameters are separated.

When you look at the error, you'll see that it fails to load the XML, not to request it from the URL:

XmlException: Invalid character in the given encoding. Line 1, position 473

which clearly points outside of your query string.

The problem could be "Apsiniaukę" (notice last character) in the XML response...
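To confirm where the parser gives up, XmlException exposes the LineNumber and LinePosition properties. A small offline sketch with a made-up document (not the actual response), assuming ISO-8859-1 bytes without an encoding declaration:

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Made-up response body: ISO-8859-1 bytes with no encoding declaration.
byte[] body = Encoding.GetEncoding("ISO-8859-1")
    .GetBytes("<?xml version=\"1.0\"?><weather condition=\"caf\u00E9\"/>");

int line = 0, position = 0;
try
{
    XDocument.Load(new MemoryStream(body));
}
catch (XmlException ex)
{
    // The reported location refers to the XML content, not to the request URL.
    line = ex.LineNumber;
    position = ex.LinePosition;
    Console.WriteLine($"Parse failed at line {line}, position {position}");
}
```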

Alexei Levenkov
  • Thanks. How can I solve this problem with special symbols like `ę`? – Wizard May 04 '12 at 20:53
  • You can download the stream as raw data instead of XML and then try to load it with UTF-16 encoding (instead of the default UTF-8). Someone should ask the Google folks to fix it, though. – Alexei Levenkov May 04 '12 at 20:58
  • Search for it, and ask a separate question if you can't find an answer. – Alexei Levenkov May 04 '12 at 21:02
  • +1: I realize that your answer pointed in the right direction from the beginning. One quick observation: It’s unlikely that the XML parsing would have gotten as far as “position 473” if the encoding really were UTF-16. In most cases, the culprit encoding is Windows-1252 (or its closely-related ISO-8859-1), which is identical to UTF-8 (and US-ASCII) for the first 128 characters. – Douglas May 04 '12 at 22:04
0

Instead of "&", use "&amp;" or "&amp;amp;", and it will work fine.

StepUp