
My program is an HTML parser that saves everything into an XML file. The problem is that when I open the file and read the text, it gives me, for example, &quot;NAME&quot; when it should be "NAME". It seems that when I use .Replace(""", """), it writes &quot; as &amp;quot; once again. How should I handle it?

Edit:

It's <td> "IN QUOTE" BLA BLA BLA</td>

I do save this right here:

 debt.Debtor.LegalPerson.Name = nazwa;

While debugging, the string I get is: &quot;IN QUOTE&quot; BLA BLA BLA

But when I write everything into XML:

var serializer = new XmlSerializer(typeof(BGW_IMPORT));
            serializer.Serialize(writer, bgw);
        }

       ...
        }
        if (File.Exists(FilePath))
        {
            //XDocument existing;
            XmlDocument ex = new XmlDocument();
            XmlDocument docX = new XmlDocument();

            using (FileStream fs = new FileStream(FilePath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None))
            {
                //existing = XDocument.Load(fs);

                docX.LoadXml(doc.Document.ToString());
                ex.Load(fs);

                foreach (XmlNode wiersz in docX.SelectNodes("//Debt"))
                {
                    XmlNode importNode = ex.ImportNode(wiersz, true);
                    ex.DocumentElement["Debts"].AppendChild(importNode);

                }
            }

            File.Delete(FilePath);
            using (FileStream fs = new FileStream(FilePath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None))
            {
                ex.Save(fs);

            }...

At the end I get:

<Name>&amp;quot;IN QUOTE&amp;quot;BLA BLA BLA</Name>

When I want:

<Name>&quot;IN QUOTE&quot;BLA BLA BLA</Name>
Qbej

1 Answer


You first need to encode your string using System.Web.HttpUtility.HtmlEncode() and then decode it using HtmlDecode().

Refer to the link: https://msdn.microsoft.com/en-us/library/7c5fyk1k(v=vs.110).aspx
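
A minimal sketch of the decode step, assuming the scraped value still contains HTML entities when it is assigned to Name. The LegalPerson class and the QuoteDemo helper are hypothetical stand-ins, not the asker's real types, and System.Net.WebUtility.HtmlDecode is used here in place of System.Web.HttpUtility.HtmlDecode only so the snippet does not need a reference to System.Web. The idea is that XmlSerializer escapes every '&' it is given, so the string should hold real quote characters before it is serialized.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml.Serialization;

// Hypothetical stand-in for the asker's LegalPerson type.
public class LegalPerson
{
    public string Name { get; set; }
}

public static class QuoteDemo
{
    // Serializes a LegalPerson with the given name to an XML string.
    public static string ToXml(string name)
    {
        var serializer = new XmlSerializer(typeof(LegalPerson));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, new LegalPerson { Name = name });
            return writer.ToString();
        }
    }

    // Decodes HTML entities first, so the serializer receives real
    // quote characters instead of the literal text "&quot;".
    public static string ToXmlDecoded(string scrapedName)
    {
        return ToXml(WebUtility.HtmlDecode(scrapedName));
    }
}
```

Calling QuoteDemo.ToXml with the raw debugger value should reproduce the &amp;quot; output from the question, while QuoteDemo.ToXmlDecoded should write plain quote characters inside the Name element. Plain quotes are valid in XML element text, so an XML reader returns the same string either way.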

amit dayama
  • `instead of .Replace(""", """)` I don't have such a replacement, and replacing &amp with "\"" would leave me with "quot; – Qbej Sep 23 '15 at 11:16
  • That's weird, but it does nothing. I mean it does, but I feel like this & is replaced by &amp; again at the end. I don't know... maybe it's some type of encoding problem? – Qbej Sep 23 '15 at 11:29
  • Oh yeah, I remember: it's related to System.Web.HttpUtility.HtmlEncode() and HtmlDecode(). This link would help you: https://msdn.microsoft.com/en-us/library/7c5fyk1k(v=vs.110).aspx – amit dayama Sep 23 '15 at 11:35