I'm very new to C# and XML files in general. I currently have an XML file that still contains some HTML markup (&, ;quot;, etc.), and I want to read through the file and remove all of it so the content is easy to read. I can open the file and print it to the console with no issue, but I'm stumped on how to search for those specific strings and remove them.
- Maybe `String.IndexOf` or `String.Replace` is what you're searching for? – Gusman Sep 25 '20 at 15:36
- Looks like you mean HTML markup when talking about "script syntax". This might help: https://stackoverflow.com/questions/2720684/c-function-to-replace-all-html-special-characters-with-normal-text-characters – Christoph Lütjen Sep 25 '20 at 15:42
- Can you present a small sample input file and what you would want the output to look like? – NineBerry Sep 25 '20 at 15:47
2 Answers
I suppose you are looking for this: https://learn.microsoft.com/en-us/dotnet/api/system.web.httputility.htmldecode?view=netcore-3.1
Converts a string that has been HTML-encoded for HTTP transmission into a decoded string.
using System;
using System.IO;
using System.Web;

// Sample input for illustration; any string works here.
string myString = "Tom & Jerry say \"hi\"";
// Encode the string.
string myEncodedString = HttpUtility.HtmlEncode(myString);
Console.WriteLine($"HTML Encoded string is: {myEncodedString}");
StringWriter myWriter = new StringWriter();
// Decode the encoded string into the writer.
HttpUtility.HtmlDecode(myEncodedString, myWriter);
string myDecodedString = myWriter.ToString();
Console.Write($"Decoded string of the above encoded string is: {myDecodedString}");
Your string is HTML-encoded, probably for transmission over a network, so you can use the built-in method to decode it instead of removing the entities by hand.
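Applied to a whole file, the idea might look like the minimal sketch below. It uses `System.Net.WebUtility.HtmlDecode` rather than `HttpUtility`, simply because it needs no reference to `System.Web`. The class name `EntityCleaner`, its method names, and the command-line arguments are made up for illustration. Note that decoding the entire file can leave bare `&` or `<` characters in the text, so the result may no longer be well-formed XML.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper for illustration; not part of any library.
public static class EntityCleaner
{
    // Decodes HTML entities such as &amp; and &quot; into plain characters.
    public static string Decode(string text) => WebUtility.HtmlDecode(text);

    // Reads inputPath, decodes its contents, and writes the result to outputPath.
    public static void DecodeFile(string inputPath, string outputPath)
    {
        string contents = File.ReadAllText(inputPath);
        File.WriteAllText(outputPath, Decode(contents));
    }

    public static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: EntityCleaner <input file> <output file>");
            return;
        }
        DecodeFile(args[0], args[1]);
    }
}
```

If the text was encoded twice (for example `&amp;quot;`), a single pass only yields `&quot;`, so `Decode` would have to be called a second time.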

Athanasios Kataras
One way to do this would be to put all the words you want to remove into an array, and then use the `Replace` method to replace them with empty strings:
using System.IO;

var xmlFilePath = @"c:\temp\original.xml";
var newFilePath = @"c:\temp\modified.xml";
var wordsToRemove = new[] {"&", ";quot;"};

// Read the existing XML file
var fileContents = File.ReadAllText(xmlFilePath);

// Remove each word by replacing it with an empty string
foreach (var word in wordsToRemove)
{
    fileContents = fileContents.Replace(word, "");
}

// Create a new file with the words removed
File.WriteAllText(newFilePath, fileContents);

Rufus L