using System;

namespace UnicodeRlm
{
    class Program
    {
        static void Main(string[] args)
        {
            var uri = new Uri(
                "https://example.com/attachments/The title is \"مفتاح معايير الويب!‏\" in Arabic.pdf");
            Console.WriteLine(uri.AbsolutePath);
            Console.WriteLine(uri.AbsolutePath.Length);
        }
    }
}

Under .NET 4.0, this produces:

/attachments/The%20title%20is%20%22%D9%85%D9%81%D8%AA%D8%A7%D8%AD%20%D9%85%D8%B9%D8%A7%D9%8A%D9%8A%D8%B1%20%D8%A7%D9%84%D9%88%D9%8A%D8%A8!%E2%80%8F%22%20in%20Arabic.pdf
168

Under .NET 4.5+, this produces:

/attachments/The%20title%20is%20%22%D9%85%D9%81%D8%AA%D8%A7%D8%AD%20%D9%85%D8%B9%D8%A7%D9%8A%D9%8A%D8%B1%20%D8%A7%D9%84%D9%88%D9%8A%D8%A8!%22%20in%20Arabic.pdf
159

.NET 4.5 drops the %E2%80%8F part, which is the percent-encoded right-to-left mark (RLM, U+200F):

...!%E2%80%8F%22%20in%20Arabic.pdf
...!%22%20in%20Arabic.pdf
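To confirm that the missing nine characters are exactly the RLM, here is a minimal sketch that percent-encodes the UTF-8 bytes of U+200F by hand. The class name RlmDemo and the helper PercentEncodeUtf8 are hypothetical names introduced only for this illustration; the helper deliberately avoids System.Uri so that the result does not depend on the framework version.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RlmDemo
{
    // Hypothetical helper: percent-encodes every UTF-8 byte of the input.
    // It does not go through System.Uri, so it behaves the same on all versions.
    public static string PercentEncodeUtf8(string text)
    {
        return string.Concat(
            Encoding.UTF8.GetBytes(text).Select(b => "%" + b.ToString("X2")));
    }

    public static void Main()
    {
        // U+200F is the right-to-left mark (RLM).
        string encoded = PercentEncodeUtf8("\u200F");
        Console.WriteLine(encoded);        // %E2%80%8F
        Console.WriteLine(encoded.Length); // 9
    }
}
```

The length of 9 matches the difference between the two reported path lengths (168 - 159), so the RLM is the only thing being dropped.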

My hypothesis is that this happens because System.Uri escaping now supports RFC 3986, but my RFC-fu and Unicode-fu are failing me: I can't tell whether the RFC requires the RLM to be dropped, or whether the RLM character is placed correctly at all in the original string.

I'm not entirely sure whether this is the correct behavior standards-wise, but for me it certainly isn't, since in .NET 4.5 I cannot download a file with an RLM character in its name with either WebClient or HttpWebRequest.

Is there any way to work around this quirk?

Anton Gogolev
  • Does this answer your question? [System.Uri.ToString behaviour change after VS2012 install](https://stackoverflow.com/questions/12004214/system-uri-tostring-behaviour-change-after-vs2012-install) – Peter O. Jan 20 '21 at 08:45
  • Isn't this [expected behavior](https://learn.microsoft.com/en-us/dotnet/api/system.uri?view=net-5.0)? – Zer0 Jan 26 '21 at 06:58
  • What if you `UrlEncode` the text before you create the Uri? Does UrlEncode behave the same, or does it honor the RLM on both versions? – Tch Jan 26 '21 at 18:40

1 Answer


In .NET 4.5, Internationalized Resource Identifier (IRI) support was enabled by default. When targeting .NET 4.7.2, the right-to-left mark seems to be honored again, which could indicate that the 4.5 behavior was a bug.

If the project needs to target .NET 4.5, the ToggleIDNIRISupport method in this post can help to work around the issue.

Call the method like this:

ToggleIDNIRISupport(false);

When constructing the URI after this method call, it contains the right-to-left mark.
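For readers who cannot reach the linked post, here is a rough sketch of what such a method might look like. It is not the linked post's code. It assumes that System.Uri on .NET Framework keeps its IDN and IRI settings in private static fields named s_IdnScope and s_IriParsing; those names are undocumented implementation details and may be missing or different on other runtimes, in which case the sketch does nothing and returns false. The class name UriIriToggle and the bool return value are likewise illustrative choices.

```csharp
using System;
using System.Reflection;

public static class UriIriToggle
{
    // Returns true only if both private fields were found and updated.
    public static bool ToggleIDNIRISupport(bool enable)
    {
        // Parse one URI first so System.Uri completes its own lazy static
        // initialization before the fields are overwritten.
        Uri.IsWellFormedUriString("http://example.com/?q=\u00FC", UriKind.Absolute);

        const BindingFlags flags = BindingFlags.Static | BindingFlags.NonPublic;

        // ASSUMPTION: these private field names are implementation details
        // of System.Uri on .NET Framework, not a public API.
        FieldInfo idnScope = typeof(Uri).GetField("s_IdnScope", flags);
        FieldInfo iriParsing = typeof(Uri).GetField("s_IriParsing", flags);

        // Bail out without changing anything if the runtime does not match.
        if (idnScope == null || iriParsing == null) return false;
        if (!idnScope.FieldType.IsEnum || iriParsing.FieldType != typeof(bool)) return false;
        if (idnScope.IsInitOnly || iriParsing.IsInitOnly) return false;

        // Parse the enum value by name ("All" / "None") so the sketch does not
        // need a compile-time reference to the enum type itself.
        idnScope.SetValue(null, Enum.Parse(idnScope.FieldType, enable ? "All" : "None"));
        iriParsing.SetValue(null, enable);
        return true;
    }

    public static void Main()
    {
        bool changed = ToggleIDNIRISupport(false);
        Console.WriteLine("IRI/IDN parsing toggled: " + changed);
    }
}
```

Because this changes process-wide static state through reflection, it would affect every Uri constructed afterwards, so it is probably best called once at startup and checked against the exact framework version in use.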

alex-dl
  • Seems to be right (upvote); it might really be a bug. I've been reading the reference documentation for the Uri constructor, but it is a lot of code and it's published only for 4.8, so even if it works well in 4.8, it may not in 4.5. – SammuelMiranda Feb 01 '21 at 17:38