3

I would like to prepare a string so that I can put it into a POST request. Unfortunately, all the methods I found for encoding a URL seem to apply percent-encoding to only a handful of characters. The various methods, e.g. HttpUtility.UrlEncode, leave some characters, such as () and §, untouched.

Abdul Munim
user1046221
  • I might be wrong - but try UrlEncodeUnicode – Marc Gravell Nov 14 '11 at 19:35
  • The alternative, of course, is just some GetBytes followed by ToString("x2") – Marc Gravell Nov 14 '11 at 19:37
  • What I stated above applies to HttpUtility.UrlEncodeUnicode, too. Among others, the characters !, () and < aren't escaped. – user1046221 Nov 14 '11 at 19:39
  • Why do you need those characters escaped? The reason they're not escaped is because they don't need to be. A POST handler should be able to handle whatever gets created by `HttpUtility.UrlEncode`. – Jacob Nov 14 '11 at 21:35

2 Answers

5

Is this more what you're looking for?

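// Requires: using System; using System.Linq; using System.Web;
string input = @"such as () and § untouched.";
//Console.WriteLine(input);
// Built-in encoders: both leave ( ) and . unescaped.
Console.WriteLine(HttpUtility.UrlEncodeUnicode(input)); // non-standard %uXXXX form for non-ASCII
Console.WriteLine(HttpUtility.UrlEncode(input));        // UTF-8 bytes for non-ASCII

// Escape every character as % plus its code point in two hex digits.
// Note: § becomes %a7, which is not a valid UTF-8 sequence, so UrlDecode cannot restore it.
string everything = string.Join("", input.ToCharArray().Select(c => "%" + ((int)c).ToString("x2")).ToArray());
Console.WriteLine(everything);
Console.WriteLine(HttpUtility.UrlDecode(everything));

//This is my understanding of what you're asking for:
// Escape every character in the %uXXXX form, which UrlDecode does round-trip.
string everythingU = string.Join("", input.ToCharArray().Select(c => "%u" + ((int)c).ToString("x4")).ToArray());
Console.WriteLine(everythingU);
Console.WriteLine(HttpUtility.UrlDecode(everythingU));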

which outputs:

such+as+()+and+%u00a7+untouched.
such+as+()+and+%c2%a7+untouched.
%73%75%63%68%20%61%73%20%28%29%20%61%6e%64%20%a7%20%75%6e%74%6f%75%63%68%65%64%2e
such as () and � untouched.

%u0073%u0075%u0063%u0068%u0020%u0061%u0073%u0020%u0028%u0029%u0020%u0061%u006e%u0064%u0020%u00a7%u0020%u0075%u006e%u0074%u006f%u0075%u0063%u0068%u0065%u0064%u002e
such as () and § untouched.
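
The fourth output line shows � because %a7 on its own is not a valid UTF-8 sequence, and the %u form is not standard percent-encoding. If the receiving side expects ordinary percent-encoded UTF-8, one option is to escape the UTF-8 bytes instead of the characters, roughly along the lines of the GetBytes suggestion in the comments. This is only a sketch: `PercentEncoder` and `EncodeAll` are names I made up for illustration, not part of .NET.

```csharp
using System;
using System.Linq;
using System.Text;

public static class PercentEncoder
{
    // Hypothetical helper (not a .NET API): percent-encodes every UTF-8 byte
    // of the input, including letters, digits and punctuation.
    public static string EncodeAll(string input)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(input);
        // Each byte becomes "%" followed by two uppercase hex digits.
        return string.Concat(bytes.Select(b => "%" + b.ToString("X2")));
    }
}
```

With this, `PercentEncoder.EncodeAll("() §")` gives `%28%29%20%C2%A7`, which a UTF-8 based decoder such as HttpUtility.UrlDecode should turn back into the original string.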
Thymine
0

Unfortunately, most existing encoders only escape the characters that are not safe in a URL, because they are built either for URL encoding (as you mentioned) or for preventing cross-site scripting.

If you want to make one from scratch, it would take some time to do the look-up, but it would be interesting. You could override existing encoders and add the additional characters that concern you, or change all letters and numbers too.
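
A from-scratch encoder does not have to be long. As a rough sketch, assuming you want to keep only the RFC 3986 "unreserved" characters (letters, digits, `-`, `.`, `_`, `~`) and escape everything else as UTF-8 bytes, it could look like this. `StrictUrlEncoder` and `Encode` are hypothetical names, and the allowed set is my assumption; adjust it to whichever characters concern you.

```csharp
using System;
using System.Text;

public static class StrictUrlEncoder
{
    // Assumption: only RFC 3986 unreserved characters are left as-is.
    private const string Unreserved =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    public static string Encode(string input)
    {
        var sb = new StringBuilder();
        foreach (byte b in Encoding.UTF8.GetBytes(input))
        {
            char c = (char)b;
            if (b < 0x80 && Unreserved.IndexOf(c) >= 0)
            {
                sb.Append(c); // safe ASCII character, copied unchanged
            }
            else
            {
                sb.Append('%').Append(b.ToString("X2")); // everything else is escaped
            }
        }
        return sb.ToString();
    }
}
```

For example, `StrictUrlEncoder.Encode("a(b)!<§ c")` returns `a%28b%29%21%3C%C2%A7%20c`, so `!`, `()`, `<` and `§` are all escaped while plain letters stay readable.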

Here is a good link for UTF/ASCII --> HTML Encoded (%)

Farmer Joe
Ray K
  • Why doesn't .NET incorporate such a converter? The lack is embarrassing and frustrating when trying to proceed quickly with the development. I was hoping not to need to descend to such basic tasks. – user1046221 Nov 14 '11 at 19:42
  • But it did show the hexadecimal value of each character, which is pretty much the same as URL encoding (at least in my understanding). – user1046221 Nov 18 '11 at 13:50