Below is an example of how my XML currently looks:

<Records>
  <Record>
    <Field id='NAMEOFBUSINESS'>
        <Value>J's Burgers</Value>
    </Field>
    <Field id='BUSINESSPHONE'>
        <Value>777-888-9999</Value>
    </Field>
  <Record>
</Records>

However, I need it to look like this:

<Records>
  <Record>
    <Field id='NAMEOFBUSINESS'>
        <Value>J&apos;s Burgers</Value>
    </Field>
    <Field id='BUSINESSPHONE'>
        <Value>777-888-9999</Value>
    </Field>
  <Record>
</Records>

Currently my code looks like this:

using (var sr = new StreamReader(filePath, encode))
        {
            xmlDocument.Load(sr);
        }
        XmlNodeList nlist = null;

        XmlNode root = xmlDocument.DocumentElement;
        if (root != null)
        {
            nlist = root.SelectNodes("//Field");
        }

        if (nlist == null)
        {
            return;
        }
        foreach (XmlElement node in nlist)
        {
            if (node == null)
            {
                continue;
            }
            var value = node.Value;
            if (value != null)
            {
                var newValue = value.Replace("'", "&apos;");
                node.Value = newValue;
            }
        }
        using (var xmlWriter = new XmlTextWriter(filePath, encode))
        {
            xmlWriter.QuoteChar = '\'';
            xmlDocument.Save(xmlWriter);
        }            

So I need to escape the apostrophe ('), but only inside the Value elements that contain one.

Thomas Weller
  • Are you looking for a better way to do the escaping? You haven't said whether this code works or not, or why it's required (a single apostrophe in a node value is perfectly legal XML -- I'd be more concerned about the single-quoted attribute values). – Cᴏʀʏ Jul 14 '15 at 14:08
  • The reason I need it to only be concerned with the value fields is for diffing purposes. This code doesn't work. – Jonathan Underwood Jul 14 '15 at 14:11
  • @Cᴏʀʏ single quoted attribute values are perfectly fine in XML (and useful if you want to include a double quote character within the value without having to escape it as `&quot;`). The only restriction is that you can't put unescaped single quotes within a single-quoted attribute or double quotes within a double-quoted one. – Ian Roberts Jul 14 '15 at 16:41

1 Answer

First, the XML you have is not well-formed, maybe because of a typo: line 9 should read </Record> instead of <Record>. Unless this is fixed, an XML parser will throw an exception.

Other than that, the XML is fine. An apostrophe only needs to be escaped inside an attribute value that is itself delimited by apostrophes, never in element content. So there is actually no reason to replace it unless another application requires it.
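To illustrate the point about element content, here is a minimal sketch that parses the same field once with a raw apostrophe and once with `&apos;`, then compares the text a parser hands back. The class name `ApostropheDemo` and the helper `ReadValue` are made up for this illustration; only the `XmlDocument` calls come from the framework.

```csharp
using System;
using System.Xml;

public static class ApostropheDemo
{
    // Hypothetical helper: parses the fragment and returns the text
    // of the first <Value> element it contains.
    public static string ReadValue(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectSingleNode("//Value").InnerText;
    }

    public static void Main()
    {
        string raw = ReadValue(
            "<Field id='NAMEOFBUSINESS'><Value>J's Burgers</Value></Field>");
        string escaped = ReadValue(
            "<Field id='NAMEOFBUSINESS'><Value>J&apos;s Burgers</Value></Field>");

        // Both spellings decode to the same text, so a parser cannot
        // tell them apart once the document is loaded.
        Console.WriteLine(raw == escaped); // True
    }
}
```

This also suggests why a replacement done on the loaded document cannot bring back the `&apos;` spelling: by the time the text is available, the distinction is already gone.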

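Right now, you're doing the replacement on the <Field> element, whereas it should be done on the <Value> element. In addition, Value is always null for an element node, so the replacement never runs; read the text through InnerText instead. So change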

nlist = root.SelectNodes("//Field");
...
var value = node.Value;

to

nlist = root.SelectNodes("//Field/Value");
...
var value = node.InnerText;

This will generate the following XML:

... <Value>J&amp;apos;s Burgers</Value> ...

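but that's perfectly legitimate XML. Note that the ampersand itself gets escaped, so any XML-compliant application will read the text back as the literal string &apos; (not as an apostrophe), as shown in the following code: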

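// Requires: using System; using System.Xml;
var xml = new XmlDocument();
xml.LoadXml("...XML here...");
XmlNodeList nodes = xml.SelectNodes("//Field/Value");
foreach (XmlElement node in nodes)
{
    // InnerText stores the replacement as literal text, so the
    // ampersand is escaped as &amp; when the document is serialized.
    node.InnerText = node.InnerText.Replace("'", "&apos;");
}
// Serialized result (contains &amp;apos;)
Console.WriteLine(xml.OuterXml);

// This is what other applications will get (the literal text &apos;)
Console.WriteLine(xml.SelectSingleNode("//Field/Value/text()").Value);
Console.ReadLine();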
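As a side note on why the original `node.Value` check never did anything, the following minimal sketch compares `Value` and `InnerText` on an element node. The class name `ElementValueDemo` and the helper `LoadValueElement` are made up for this illustration, and the one-field XML string is a cut-down version of the sample from the question.

```csharp
using System;
using System.Xml;

public static class ElementValueDemo
{
    // Hypothetical helper: parses a one-field sample and returns
    // its <Value> element.
    public static XmlElement LoadValueElement()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Field id='NAMEOFBUSINESS'><Value>J's Burgers</Value></Field>");
        return (XmlElement)doc.SelectSingleNode("//Field/Value");
    }

    public static void Main()
    {
        XmlElement element = LoadValueElement();

        // Value is null for element nodes, so a null check on it
        // always skips the replacement.
        Console.WriteLine(element.Value == null);    // True

        // The text lives in the child text node; InnerText collects it.
        Console.WriteLine(element.InnerText);        // J's Burgers
        Console.WriteLine(element.FirstChild.Value); // J's Burgers
    }
}
```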
Thomas Weller