I have the following test case:

    [TestMethod]
    public void SimpleEncodingTest()
    {
        var report = new SimpleReport{Title = @"[quote]""[/quote] [apo]'[/apo] [smaller]<[/smaller] [bigger]>[/bigger] [and]&[/and]" };


        XmlSerializer xsSubmit = new XmlSerializer(typeof(SimpleReport));

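        string xml;

        using (var sww = new StringWriter())
        {
            // Note: Encoding is ignored when the XmlWriter targets a TextWriter;
            // a StringWriter always reports UTF-16, hence encoding="utf-16" in the output.
            using (XmlWriter writer = XmlWriter.Create(sww, new XmlWriterSettings
            {
                Encoding = Encoding.Default
            }))
            {
                xsSubmit.Serialize(writer, report);
            }

            // Read the result only after the writer has been flushed and closed.
            xml = sww.ToString();
        }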


    }

I want all special characters, including quotes and apostrophes, to be encoded, like so:

    <?xml version="1.0" encoding="utf-16" ?>
    <SimpleReport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <Title>[quote]&quot;[/quote] [apo]&apos;[/apo] [smaller]&lt;[/smaller] [bigger]&gt;[/bigger] [and]&amp;[/and]</Title>
    </SimpleReport>

With the title being "[quote]"[/quote] [apo]'[/apo] [smaller]<[/smaller] [bigger]>[/bigger] [and]&[/and]"

Instead I get:

    <?xml version="1.0" encoding="utf-16" ?>
    <SimpleReport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <Title>[quote]"[/quote] [apo]'[/apo] [smaller]&lt;[/smaller] [bigger]&gt;[/bigger] [and]&amp;[/and]</Title>
    </SimpleReport>

And the title is "[quote]"[/quote] [apo]'[/apo] [smaller]<[/smaller] [bigger]>[/bigger] [and]&[/and]".

How do I tell the serializer that I want quotes and apostrophes encoded as well?

PS: I know you don't typically need to encode these characters but this is a client requirement.

Attempts:

user1531921
  • You can use System.Net.WebUtility.HtmlDecode(string) and using System.Net.WebUtility.HtmlEncode(string) to replace HTML special characters. – jdweng Feb 24 '20 at 14:36
  • I've tried that. System.Net.WebUtility.HtmlDecode does not encode quotes. – user1531921 Feb 24 '20 at 14:46
  • The viewer you are using may be showing string with double quotes that really aren't in the string. Double quotes do not need to be encoded. See Wiki : https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references – jdweng Feb 24 '20 at 14:50
  • I am aware of that. This is a customer requirement. – user1531921 Feb 24 '20 at 14:51

1 Answer

How? Since they are not in an attribute, tell your client you encoded them in UTF-16, which you did. Otherwise you can usually use the SecurityElement.Escape(String) method to escape a string, but that will lead to double escaping here. Sadly, even doing the

" -> &quot;
' -> &apos;

replacements yourself, with

Title = text.Replace("\"", "&quot;").Replace("'", "&apos;")

leads to double escaping, because the serializer then escapes the & of each entity again. As far as I know, these two are the only characters that are not automatically escaped in element text, since they are valid at that point. So I think it's not possible the way your customer wants it, at least not with the standard serializers. Sorry.
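If a non-standard writer is acceptable, one possible way around the double escaping is to subclass XmlTextWriter and take over the escaping of element text. The following is only a sketch under assumptions: `SimpleReport` is a stand-in for the type in the question, and `FullEscapeXmlWriter` and `ReportXml` are hypothetical names. It relies on the serializer writing element text through `WriteString`, which I believe is the case but have not checked for every .NET version.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the report type from the question.
public class SimpleReport
{
    public string Title { get; set; }
}

// Hypothetical writer that escapes all five predefined XML entities in element text.
public class FullEscapeXmlWriter : XmlTextWriter
{
    public FullEscapeXmlWriter(TextWriter output) : base(output) { }

    public override void WriteString(string text)
    {
        // Only take over for element text; attribute values keep the default handling.
        bool inElementText = WriteState == WriteState.Element || WriteState == WriteState.Content;
        if (string.IsNullOrEmpty(text) || !inElementText)
        {
            base.WriteString(text);
            return;
        }

        // Escape '&' first so the entities produced below are not escaped again.
        string escaped = text
            .Replace("&", "&amp;")
            .Replace("<", "&lt;")
            .Replace(">", "&gt;")
            .Replace("\"", "&quot;")
            .Replace("'", "&apos;");

        // WriteRaw emits the text as-is, so the writer does not escape it a second time.
        WriteRaw(escaped);
    }
}

public static class ReportXml
{
    public static string Serialize(SimpleReport report)
    {
        var serializer = new XmlSerializer(typeof(SimpleReport));
        using (var sw = new StringWriter())
        {
            using (var writer = new FullEscapeXmlWriter(sw))
            {
                serializer.Serialize(writer, report);
            }
            return sw.ToString();
        }
    }
}
```

Keep in mind that WriteRaw bypasses the writer's own character checks, so any input that is not valid XML text would have to be filtered before it reaches this writer.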

Patrick Beynio
  • I tried using the SecurityElement.Escape method like this: "report.Title = SecurityElement.Escape(report.Title);". However, this confused the encoder and gave the following: "[quote]&amp;quot;[/quote] [apo]&amp;apos;[/apo]" – user1531921 Feb 24 '20 at 14:19
  • Hi again. I saw your updated answer, but my question remains. Do you mean for me to apply the replace function before or after the XML serialization? Because if I perform it before, the serializer encodes "&quot;" as "&amp;quot;". – user1531921 Feb 24 '20 at 14:35
  • you were right. that led to double quoting... i updated my answer – Patrick Beynio Feb 24 '20 at 14:35
  • omg you're still right xD then i really don't know :( then i'd think it's not possible the way your customer wants it. at least not with standardized serializers. sry – Patrick Beynio Feb 24 '20 at 14:38
  • Well anyway, thanks for your input. You could be right. However, I don't feel it's safe to do a replace on the entire XML after serialization. It might have unexpected results. – user1531921 Feb 24 '20 at 14:42
  • yea don't do it on the output that might be fatal :D – Patrick Beynio Feb 24 '20 at 15:00
  • Well, I don't see how else to solve it. The safest workaround would be to replace " with a placeholder such as [quote] before serialization and then replace [quote] with &quot; after serialization. – user1531921 Feb 24 '20 at 15:02
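
The placeholder workaround from the last comment could be sketched roughly as follows. This is only an illustration under assumptions: `SimpleReport` stands in for the type in the question, `PlaceholderEscaping` is a hypothetical helper, and instead of a textual marker like [quote] it uses two private-use Unicode characters as markers, on the assumption that they never occur in real titles.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the report type from the question.
public class SimpleReport
{
    public string Title { get; set; }
}

public static class PlaceholderEscaping
{
    // Assumed markers from the Unicode private use area; the serializer leaves them untouched.
    private const string QuoteMark = "\uE000";
    private const string AposMark = "\uE001";

    public static string Serialize(SimpleReport report)
    {
        string title = report.Title ?? string.Empty;

        // Refuse input that already contains a marker, otherwise the final replace would corrupt it.
        if (title.Contains(QuoteMark) || title.Contains(AposMark))
        {
            throw new ArgumentException("Title contains a reserved marker character.");
        }

        // Work on a copy so the caller's object is not modified.
        var copy = new SimpleReport
        {
            Title = title.Replace("\"", QuoteMark).Replace("'", AposMark)
        };

        var serializer = new XmlSerializer(typeof(SimpleReport));
        using (var sw = new StringWriter())
        {
            using (var writer = XmlWriter.Create(sw))
            {
                serializer.Serialize(writer, copy);
            }

            // Only the markers are replaced, so quotes in the XML declaration
            // and in attribute values are left alone.
            return sw.ToString()
                .Replace(QuoteMark, "&quot;")
                .Replace(AposMark, "&apos;");
        }
    }
}
```

With more than one string property this would have to be repeated per property, so it only really suits small types like the one in the question.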