I'm having a problem getting HttpWebRequest to use ISO-8859-1 encoding for the parameters in a web request. The problem affects both POSTs and GETs.

In a nutshell, any request parameters that contain non-ASCII characters such as Ö and æ get converted to their UTF-8 percent-encoded representations rather than their ISO-8859-1 representations.

Ö gets converted to %c3%96 instead of %d6.

My current idea for a solution is to convert the request string to an ISO-8859-1 byte array and then build the string back up byte by byte, catching any bytes > 127 and replacing those with their %hex values.

Is there a better way of solving this issue?

Grubsnik

2 Answers


Create your own URL-encoding algorithm as follows; the WebRequest method will use the URI you provide with your custom encoding.

using System;
using System.Net;
using System.Text;

string input = "http://www.example.com/q?Ö=æ";

StringBuilder sb = new StringBuilder();
foreach (byte by in Encoding.GetEncoding("ISO-8859-1").GetBytes(input))
{
    // NOTE: This is very simplistic; a robust solution would probably really need
    // to handle all non-alphanum and non-reserved characters, as specified by
    // http://www.ietf.org/rfc/rfc2396.txt
    if (by <= 0x7F)
        sb.Append((char) by);
    else
        sb.AppendFormat("%{0:X2}", by);
}

Uri uri = new Uri(sb.ToString());
// uri.AbsoluteUri == "http://www.example.com/q?%D6=%E6"

WebRequest request = WebRequest.Create(uri);
using (request.GetResponse())
{
    // ...
}
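
As the NOTE in the code says, a more robust version would also escape reserved ASCII characters, not just bytes above 0x7F. Here is a minimal sketch of that idea, applied to a single parameter name or value rather than the whole URL. The class and method names (Latin1UrlEncoder, EncodeComponent) are hypothetical, and the set of characters left unescaped is an assumption based on the "unreserved" set in RFC 3986; adjust it if the target site expects something different.

```csharp
using System;
using System.Text;

public static class Latin1UrlEncoder
{
    // Assumption: only the RFC 3986 "unreserved" characters are left as-is.
    private const string Unreserved =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Hypothetical helper: percent-encodes one parameter name or value
    // using its ISO-8859-1 bytes instead of UTF-8.
    public static string EncodeComponent(string value)
    {
        // Characters that ISO-8859-1 cannot represent are substituted by the
        // encoding's fallback, so they will not round-trip.
        byte[] bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(value);

        var sb = new StringBuilder(bytes.Length * 3);
        foreach (byte b in bytes)
        {
            if (b < 0x80 && Unreserved.IndexOf((char)b) >= 0)
                sb.Append((char)b);            // safe ASCII, copy through
            else
                sb.AppendFormat("%{0:X2}", b); // everything else becomes %XX
        }
        return sb.ToString();
    }
}
```

You would then join the pieces yourself, for example "q?" + EncodeComponent(name) + "=" + EncodeComponent(value). The result is pure ASCII, so for a POST the same string should be usable as an application/x-www-form-urlencoded body: convert it with Encoding.ASCII.GetBytes and write it to the request stream.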
Bradley Grainger
    This was the kind of solution I was hoping to avoid, but nevertheless a very elegant implementation of it. – Grubsnik Nov 12 '10 at 09:12

I would rather try to fix "the other side of the pipe" and make it accept UTF-8. UTF-8 is the way to go if you want to be "future proof".

Mihai Nita
    We don't have any form of control over the websites we are accessing, so the fact that they can be horribly outdated is not something we can address. – Grubsnik Nov 15 '10 at 08:21