1

I have a problem with the character encoding of an HTTP web response: I receive ? instead of é.

I set the encoding according to the page's Content-Type, which is text/javascript; charset=ISO-8859;

My code is:

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(..);
request.Method = "GET";
request.AllowAutoRedirect = false;
request.Referer = "Mozilla/5.0 (Windows NT 6.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1";
request.Headers.Add("DNT", "1");
request.Accept = "text/html,application/xhtml+xml,application/xml";

HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream stream = response.GetResponseStream();
StreamReader sr = new StreamReader(stream, Encoding.GetEncoding("iso-8859-1"));

char[] buf = new char[256];
int count;
StringBuilder buffer = new StringBuilder();

while ((count = sr.Read(buf, 0, 256)) > 0)
{
    buffer.Append(buf, 0, count);
}

string responseStr = buffer.ToString();
Console.WriteLine(responseStr);
response.Close();
stream.Close();
sr.Close();

Can you tell me what is wrong with it?

simonc
Jack

2 Answers

2

Try adding the following before you make your request:

request.Headers.Add(HttpRequestHeader.AcceptCharset, "ISO-8859-1");

Btw, you should keep your StreamReader with ISO-8859-1 (instead of UTF8) if you want to try my proposed solution. Good luck!

Tung
1

Have you tried setting it to UTF-8? Furthermore, you set the Referer header to what looks like a user-agent string; I think you meant to set UserAgent. The code below is the same as yours, except that it reads the whole response with ReadToEnd instead of looping over a character buffer, sets the UserAgent, and uses UTF-8 encoding.

var request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
request.AllowAutoRedirect = false;
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1";
request.Headers.Add("DNT", "1");
request.Accept = "text/html,application/xhtml+xml,application/xml";

using (var response = (HttpWebResponse)request.GetResponse())
using (var stream = response.GetResponseStream())
using (var sr = new StreamReader(stream, Encoding.UTF8))
{
    // ReadToEnd decodes the whole response body in one call.
    string responseStr = sr.ReadToEnd();
    Console.WriteLine(responseStr);
    // No explicit Close() calls are needed: the using statements dispose
    // the reader, the stream, and the response when the block exits.
}
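
To see why the choice between UTF-8 and ISO-8859-1 matters, here is a minimal sketch that works on raw bytes only (no network call). The class name EncodingMismatchDemo is made up for illustration. It shows what each wrong pairing produces, and one way a literal ? can arise; whether that is what happens in the question's case is an assumption, not something the thread confirms.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of the original code.
static class EncodingMismatchDemo
{
    static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string eAcute = "\u00E9"; // the character é

        // The same character is one byte in ISO-8859-1 and two bytes in UTF-8.
        Console.WriteLine(BitConverter.ToString(latin1.GetBytes(eAcute)));        // E9
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(eAcute))); // C3-A9

        // ISO-8859-1 bytes read as UTF-8: 0xE9 alone is not valid UTF-8,
        // so the decoder substitutes U+FFFD (the replacement character).
        string asUtf8 = Encoding.UTF8.GetString(new byte[] { 0xE9 });
        Console.WriteLine(asUtf8 == "\uFFFD"); // True

        // UTF-8 bytes read as ISO-8859-1: each byte becomes its own character.
        string asLatin1 = latin1.GetString(new byte[] { 0xC3, 0xA9 });
        Console.WriteLine(asLatin1 == "\u00C3\u00A9"); // True

        // Encoding into a charset that lacks the character yields a literal '?'.
        byte[] ascii = Encoding.ASCII.GetBytes(eAcute);
        Console.WriteLine(ascii[0] == (byte)'?'); // True
    }
}
```

If the decoded string looks right in the debugger but shows ? on screen, the substitution may be happening on output rather than while reading the response.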
simonc
Henk J Meulekamp
  • btw, this has a good explanation of how to read different encodings when UTF-8 is not working: http://blogs.msdn.com/b/feroze_daud/archive/2004/03/30/104440.aspx – Henk J Meulekamp Dec 02 '11 at 03:02
  • I'll check out the link. Thanks. – Jack Dec 02 '11 at 03:10
  • Maybe the server is doing something wrong, see this one for example: http://stackoverflow.com/questions/638756/httpwebrequest-receiving-response-with-the-right-encoding Then you can try to swap to: using (var sr = new StreamReader(stream, Encoding.GetEncoding(1252))) – Henk J Meulekamp Dec 02 '11 at 03:24
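
Following up on the comments about servers that report the wrong charset, one possible approach is to try the charset the server declares and fall back to a known encoding when the label is missing or not recognized. The sketch below is only an illustration under that assumption: CharsetHelper and Resolve are hypothetical names, and the fallback encoding is whatever the caller chooses.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
static class CharsetHelper
{
    // Maps a charset label (for example, from a Content-Type header)
    // to an Encoding, returning the fallback when the label is missing
    // or is not a name that Encoding.GetEncoding recognizes.
    public static Encoding Resolve(string charset, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(charset))
            return fallback;

        try
        {
            // Some servers wrap the value in quotes, so strip them first.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            // GetEncoding(string) throws ArgumentException for unknown names.
            return fallback;
        }
    }
}
```

With the question's code, this could be used as new StreamReader(stream, CharsetHelper.Resolve(response.CharacterSet, Encoding.GetEncoding("iso-8859-1"))), where response.CharacterSet is the charset reported by the HttpWebResponse.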