First of all, ID is different from id. HTML is case-insensitive, but XML (and therefore XPath) is case-sensitive. See this answer from Simon Mourier, the author of HtmlAgilityPack:
That said, when you use its XPATH feature, you must use tags written
in lower case. It means the "//body" expression will match BODY, Body
and body, and "//BODY" will match nothing.
The same applies to ID: write it in lower case (@id) in the XPath expression.
Regarding the filter logic, you have to combine the two conditions with the XPath and operator:
var nodes = doc.DocumentNode.SelectNodes(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' abc ')" +
    " and " +
    "contains(concat(' ', normalize-space(@id), ' '), ' div1 ')]");
Or, simpler, if the class attribute is exactly abc with no other classes on the element:
var nodes = doc.DocumentNode.SelectNodes("//*[@class=\"abc\" and @id=\"div1\"]");
But as a personal preference, if the context allows it, I would use LINQ to do it:
var nodes = doc.DocumentNode.Descendants()
    .Where(i =>
        i.Attributes["class"] != null
        && i.Id != null
        && i.Attributes["class"].Value == "abc"
        && i.Id == "div1");
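To try both approaches side by side, here is a minimal console sketch. The sample markup is made up for illustration (upper-case tag and attribute names, plus a second class on the target element), and it assumes the HtmlAgilityPack NuGet package is referenced. The LINQ variant here uses GetAttributeValue with a default value instead of the null checks above, and it matches the class token by token rather than by exact string, so treat it as one possible variation rather than the only way to write it.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup: upper-case names and an extra class on the target.
        var html =
            "<BODY>" +
            "<DIV ID=\"div1\" CLASS=\"abc def\">target</DIV>" +
            "<div id=\"div2\" class=\"abc\">other</div>" +
            "</BODY>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath: names in lower case, both conditions joined with "and".
        var byXPath = doc.DocumentNode.SelectNodes(
            "//*[contains(concat(' ', normalize-space(@class), ' '), ' abc ')" +
            " and @id='div1']");

        // In the versions I have used, SelectNodes returns null when nothing
        // matches, so guard against that before reading Count.
        Console.WriteLine(byXPath?.Count ?? 0);

        // LINQ: GetAttributeValue returns the supplied default when the
        // attribute is missing, so no explicit null checks are needed.
        // Splitting on a single space is a simplification of class tokenizing.
        var byLinq = doc.DocumentNode.Descendants()
            .Where(n => n.GetAttributeValue("id", "") == "div1"
                     && n.GetAttributeValue("class", "").Split(' ').Contains("abc"))
            .ToList();

        Console.WriteLine(byLinq.Count);
    }
}
```

Note that the exact-match form @class="abc" would not select the first element in this sample, because its class attribute is "abc def"; that is the case the longer contains(concat(...)) expression is meant to handle.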