4

I have file:// links with non-English characters which are URL-encoded as UTF-8. For these links to work in a browser I have to re-encode them as ISO-8859-1 (Latin-1).

file://development/H%C3%A5ndplukket.doc

becomes

file://development/H%e5ndplukket.doc

I have the following code which works:

public string ReEncodeUrl(string url)
{
    Encoding enc = Encoding.GetEncoding("iso-8859-1");
    string[] parts = url.Split('/');
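    // Start at 1 so the "file:" scheme segment is left untouched
    for (int i = 1; i < parts.Length; i++)
    {
        parts[i] = HttpUtility.UrlDecode(parts[i]); // Decode the UTF-8 escapes to a string
        parts[i] = HttpUtility.UrlEncode(parts[i], enc); // Re-encode as latin1
        parts[i] = parts[i].Replace('+', ' '); // UrlEncode emits + for spaces; change + back to [space]
    }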
    }
    return string.Join("/", parts);
}

Is there a cleaner way of doing this?

Mikael Svenson
  • In fact, the character encoding used for URI escaping depends on the server. W3C recommends UTF-8. But with file:// URLs the browser acts as the server, so it depends on the browser. If you plan to use this on a platform whose default is not ISO-8859-1 (a non-Western locale), check how it behaves there. – helios Dec 29 '09 at 11:30

3 Answers

1

I think that's pretty clean actually. It's readable and you said it functions correctly. As long as the implementation is hidden from the consumer, I wouldn't worry about squeezing out that last improvement.

If you run this operation very often (say, hundreds of executions per event), I would consider pulling the logic out of UrlEncode/UrlDecode and streaming the decode step directly into the encode step. That removes the string split/join and might improve performance, but testing would have to prove that out, and it definitely wouldn't be "clean" :-)

loosleef
  • I'll actually accept your answer on this, since there doesn't seem to be a "quicker" way of doing this. And as you say, it's readable and expresses the intent. – Mikael Svenson Jan 06 '10 at 08:51
0

While I don't see any real way of changing it that would make a difference, shouldn't the + to space replacement come before the UrlEncode call, so that the space turns into %20?

Don
  • UrlEncode will turn the space to + for latin1 encodings. That's why I replace it with a space. Could probably have replaced the + with %20 instead. – Mikael Svenson Dec 18 '09 at 13:45
0

Admittedly ugly and not really an improvement, but you could re-encode the whole string (avoiding the split/iterate/join) and then call .Replace("%2f", "/").

I don't understand the code wanting to keep a space in the final result - seems like you don't end up with something that's actually encoded if it still has spaces in it?
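For illustration, the whole-string idea could look roughly like the sketch below. It is not the asker's code: the class and method names (`UrlReEncoder`, `ReEncodeWhole`) are hypothetical, and it assumes that the only escapes that need restoring are `/` and `:`, and that `%20` is an acceptable stand-in for a space. It also inherits a caveat from `HttpUtility.UrlDecode`: a literal `+` in the input is decoded as a space.

```csharp
using System;
using System.Text;
using System.Web;

public static class UrlReEncoder
{
    // Hypothetical helper: decode UTF-8 escapes, re-encode as ISO-8859-1,
    // then restore the URL delimiters that UrlEncode escaped along the way.
    public static string ReEncodeWhole(string url)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // UrlDecode(string) interprets percent-escapes as UTF-8.
        string decoded = HttpUtility.UrlDecode(url);

        // Encoding the whole string also escapes '/' and ':' and turns spaces into '+'.
        string encoded = HttpUtility.UrlEncode(decoded, latin1);

        // Undo the unwanted escapes; both hex casings are handled defensively.
        return encoded
            .Replace("%2f", "/").Replace("%2F", "/")
            .Replace("%3a", ":").Replace("%3A", ":")
            .Replace("+", "%20"); // assumption: %20 instead of a raw space
    }
}
```

A call such as `UrlReEncoder.ReEncodeWhole("file://development/H%C3%A5ndplukket.doc")` avoids the split/join, but it trades the loop for a chain of replacements, so whether it counts as cleaner is a matter of taste.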

James Manning
  • The spaces make file:// links work in IE, so that the file opens at the correct location. I could probably use %20 as stated in my previous comment, but the + has to go; the link won't work with it. – Mikael Svenson Dec 29 '09 at 11:12
  • And it won't be cleaner as I need replacements for %2f(/), %3a(:) and +(space). – Mikael Svenson Dec 29 '09 at 11:28