To create a new node, use the HtmlNode.CreateNode() factory method; do not use the constructor directly.
This code should work for you:
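    using System.Linq;      // for ToList()
    using HtmlAgilityPack;

    var htmlStr = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
    var doc = new HtmlDocument();
    doc.LoadHtml(htmlStr);
    var query = doc.DocumentNode.Descendants("b");
    // Snapshot the query first, since the loop modifies the document.
    foreach (var item in query.ToList())
    {
        var newNodeStr = "<foo>bar</foo>";
        var newNode = HtmlNode.CreateNode(newNodeStr);
        // Swap each <b> element for the newly created node.
        item.ParentNode.ReplaceChild(newNode, item);
    }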
Note that we need to call ToList() on the query: we will be modifying the document inside the loop, so enumerating the query directly would fail.
If you wish to replace with this string instead:
    "some text <b>node</b> <strong>another node</strong>"
The problem is that it is no longer a single node but a series of nodes. You can parse it fine using HtmlNode.CreateNode(), but in the end you're only referencing the first node of the sequence. You would need to replace using the parent node:
    var htmlStr = "<b>bold_one</b><strong>strong</strong><b>bold_two</b>";
    var doc = new HtmlDocument();
    doc.LoadHtml(htmlStr);
    var query = doc.DocumentNode.Descendants("b");
    foreach (var item in query.ToList())
    {
        var newNodesStr = "some text <b>node</b> <strong>another node</strong>";
        // CreateNode() hands back only the first node of the parsed sequence...
        var newHeadNode = HtmlNode.CreateNode(newNodesStr);
        // ...so replace with its parent, which holds the whole sequence.
        item.ParentNode.ReplaceChild(newHeadNode.ParentNode, item);
    }
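Some readers report that the parent-node trick misbehaves for them (truncated markup, or a null ParentNode on a later iteration). One alternative, sketched here rather than offered as a definitive fix, is to parse the replacement markup into its own HtmlDocument with LoadHtml and move its top-level nodes in one at a time. The class name, the `fragment` variable, and the choice of InsertBefore followed by RemoveChild are assumptions made for illustration, and the behavior may vary between HtmlAgilityPack versions:

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class ReplaceWithFragment
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<b>bold_one</b><strong>strong</strong><b>bold_two</b>");

        // Snapshot the matches: the loop below changes the tree.
        foreach (var item in doc.DocumentNode.Descendants("b").ToList())
        {
            // Parse the multi-node replacement as a separate document
            // instead of relying on HtmlNode.CreateNode().
            var fragment = new HtmlDocument();
            fragment.LoadHtml("some text <b>node</b> <strong>another node</strong>");

            var parent = item.ParentNode;

            // Insert every top-level node of the fragment before the old node...
            foreach (var child in fragment.DocumentNode.ChildNodes.ToList())
            {
                parent.InsertBefore(child, item);
            }

            // ...then drop the old node itself.
            parent.RemoveChild(item);
        }

        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

Because the list of `b` elements is captured before the loop starts, the `b` nodes that arrive with the fragment are not visited again.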
foo bar
" then it truncates it to "blah blah" with no nested p tag. Same with "blah blah" - leaves out the
. I had to swap out my code with HtmlDocument.LoadHtml. But I'm sure it used to work - maybe a debug/compile quirk. – Etherman Nov 28 '16 at 14:58
with double new line and
is replaced, on the second iteration the item.ParentNode is null. Do you know how I can handle it? Thank you
. Div Handler does the same besides 1 NL `private static void SurroundWithDoubleLineBreak(HtmlNode node) { var text = Environment.NewLine + Environment.NewLine + node.InnerHtml + Environment.NewLine + Environment.NewLine; node.ParentNode.ReplaceChild(HtmlNode.CreateNode(text).ParentNode, node); }`
– VladL Feb 08 '17 at 11:08