I'm downloading XML files from SharePoint Online using WebClient.

However, when I use the WebClient.DownloadString(string url) method, some characters are not decoded correctly.

When I use WebClient.DownloadFile(string url, string file) and then read the file, all characters are correct.

The XML itself does not contain an encoding declaration.

string wrongXml = webClient.DownloadString(url);
// wrongXml contains Ä™ instead of ę

webClient.DownloadFile(url, @"C:\temp\file1.xml");
string correctXml = File.ReadAllText(@"C:\temp\file1.xml");
// correctXml contains ę, as it should

Also, when I open the URL in Internet Explorer, it is displayed correctly.

Why is that? Is it because of the default Windows encoding on my machine, or does WebClient handle responses differently in DownloadString than in DownloadFile?

Camilo Terevinto
Liero

1 Answer

The encoding WebClient uses to decode the response is probably not the one the service returns.

You can set the encoding you expect before you make the request:

// Decode the response as UTF-8 instead of the default encoding.
webClient.Encoding = Encoding.UTF8;
string previouslyWrongXml = webClient.DownloadString(url);
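
To illustrate where Ä™ comes from: the UTF-8 encoding of ę is the two bytes 0xC4 0x99, and a single-byte code page reads them as two separate characters. The following is only a minimal sketch. It assumes Windows-1252 as the code page (the question does not say which one the machine uses) and .NET Core 3.0 or later, where code page encodings have to be registered first. `MojibakeDemo` and `DecodeUtf8BytesAs` are hypothetical names.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes `text` as UTF-8 and decodes the bytes with another encoding,
    // mimicking a client that guesses the response charset incorrectly.
    public static string DecodeUtf8BytesAs(string text, Encoding decoder)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return decoder.GetString(utf8Bytes);
    }

    // Assumption: Windows-1252 stands in for the machine's ANSI code page.
    public static Encoding GetAnsiEncoding()
    {
        // Needed on .NET Core / .NET 5+; on .NET Framework code page 1252
        // is available without this registration.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // Wrong decoder: each UTF-8 byte becomes its own character.
        Console.WriteLine(DecodeUtf8BytesAs("\u0119", GetAnsiEncoding()));
        // Matching decoder: the original character round-trips.
        Console.WriteLine(DecodeUtf8BytesAs("\u0119", Encoding.UTF8));
    }
}
```

With the wrong decoder the result is the two characters Ä and ™; with UTF-8 it is ę again. DownloadFile avoids the problem because it stores the bytes without decoding them, and File.ReadAllText assumes UTF-8 when the file has no byte order mark.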
Patrick Hofman
  • Good to know, but what if I don't know the encoding? – Liero Jan 03 '18 at 13:13
  • I wonder if the service returns the correct encoding in the header. If it does, you can use that to read the response bytes and convert them to the correct encoding. You might need `HttpWebRequest` then, but I am not sure if it is possible with `WebClient` instead. – Patrick Hofman Jan 03 '18 at 13:14
  • It contains: `Content-Disposition: attachment; filename*=UTF-8'aaa.xml; filename="aaa.xml"`, so I can use that when using HttpWebRequest, but since it seems to be the same all the time in my case, I will just set it on the WebClient. – Liero Jan 03 '18 at 13:31
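
The idea from the comments, reading the raw bytes and picking the encoding from the response headers, might look roughly like the sketch below. It uses WebClient.DownloadData together with the charset parameter of the Content-Type header, and falls back to UTF-8 when no usable charset is declared. `ResponseDecoder`, `ResolveEncoding`, and `DownloadText` are hypothetical names, and the UTF-8 fallback is an assumption that happens to fit the case in the question. Note that the `UTF-8` in `filename*=` of a Content-Disposition header describes how the file name is encoded, so it is not a reliable indicator of the body's encoding.

```csharp
using System;
using System.Net;
using System.Net.Mime;
using System.Text;

public static class ResponseDecoder
{
    // Returns the encoding named by the charset parameter of a Content-Type
    // header value, or `fallback` if the header is missing or unusable.
    public static Encoding ResolveEncoding(string contentTypeHeader, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(contentTypeHeader))
        {
            return fallback;
        }
        try
        {
            string charset = new ContentType(contentTypeHeader).CharSet;
            return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
        }
        catch (FormatException)
        {
            return fallback; // header value could not be parsed
        }
        catch (ArgumentException)
        {
            return fallback; // charset name is not a known encoding
        }
    }

    // Downloads the raw bytes and decodes them with the declared charset,
    // assuming UTF-8 when the server does not declare one.
    public static string DownloadText(WebClient client, string url)
    {
        byte[] bytes = client.DownloadData(url);
        string contentType = client.ResponseHeaders?[HttpResponseHeader.ContentType];
        return ResolveEncoding(contentType, Encoding.UTF8).GetString(bytes);
    }
}
```

A call such as `ResponseDecoder.DownloadText(webClient, url)` would then replace `webClient.DownloadString(url)`. If the payload starts with a byte order mark, `GetString` keeps it as a leading U+FEFF character, which may need trimming before the XML is parsed.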