How to search both class and ID names in HTML code using same xpath query?

Question

I am using the code below to search for the class name abc in the HTML code:

nodes = doc.DocumentNode.SelectNodes("//*[contains(concat(' ', normalize-space(@class), ' '), ' abc ')]");

This gives me the correct result.

But if I want to search for the ID name abc instead of the class, the above code does not work.

Perhaps that is because the expression refers to @class, so it cannot match ID names.

Is there any way to search both class and ID names using the same code?

@Ranon, I believe the question is about HTML code. Possibly using the [HtmlAgilityPack](http://htmlagilitypack.codeplex.com/). — Alex Filipovici, Jan 30 '13 at 13:41
XPath works on XML, so if he wants to use it on HTML it must be valid XML (or he has to use some HTML parser which converts it to valid XML). Doesn't matter anyway, he should add some input. — Jens Erat, Jan 30 '13 at 13:43
Hi, I am using HtmlAgilityPack to get the part of the HTML code in which I am searching for the class or ID name. I am not using any XML code here. — user2025463, Jan 30 '13 at 13:54

score 0 · Accepted Answer · edited May 23 '17 at 12:12

First of all, ID is different from id. HTML is case-insensitive, but XML is case-sensitive. Take a look at this answer from Simon Mourier, the author of HtmlAgilityPack.

That said, when you use its XPath feature, you must write element names in lowercase. That is, the "//body" expression will match BODY, Body and body, while "//BODY" will match nothing.

The same holds for the id attribute: write @id, not @ID.
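To make the case rule concrete, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced; the markup and the class name CaseDemo are made up for illustration.

```csharp
using System;
using HtmlAgilityPack;

class CaseDemo
{
    static void Main()
    {
        var doc = new HtmlDocument();
        // Hypothetical markup that uses upper-case tag and attribute names.
        doc.LoadHtml("<HTML><BODY><DIV ID=\"div1\" CLASS=\"abc\">hello</DIV></BODY></HTML>");

        // Same query, once in lowercase and once in uppercase.
        var lower = doc.DocumentNode.SelectNodes("//div[@id='div1']");
        var upper = doc.DocumentNode.SelectNodes("//DIV[@ID='div1']");

        // SelectNodes returns null rather than an empty collection
        // when nothing matches, so guard before reading Count.
        Console.WriteLine("lower-case query: " + (lower == null ? 0 : lower.Count));
        Console.WriteLine("upper-case query: " + (upper == null ? 0 : upper.Count));
    }
}
```

Going by the rule above, the lowercase query should find the div and the uppercase one should find nothing.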

Regarding the filter logic, to require both a class and an ID on the same element you have to use the logical and operator:

var nodes = doc.DocumentNode.SelectNodes(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' abc ')"+
        " and " +
        "contains(concat(' ', normalize-space(@id), ' '), ' div1 ')]");

Or, simpler, if each attribute holds exactly that one value (this form will not match class="abc other"):

var nodes = doc.DocumentNode.SelectNodes("//*[@class=\"abc\" and @id=\"div1\"]");

But as a personal preference, if the context allows it, I would use LINQ to do it:

var nodes = doc.DocumentNode.Descendants()
    .Where(i => 
        i.Attributes["class"] != null 
        && i.Id != null 
        && i.Attributes["class"].Value == "abc" 
        && i.Id == "div1");
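
The expressions above require both conditions to hold on the same element. If the goal is instead to find elements whose class or id is abc, which is how the question can also be read, the XPath or operator could be used in place of and. A minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; the sample markup and the class name OrDemo are hypothetical:

```csharp
using System;
using HtmlAgilityPack;

class OrDemo
{
    static void Main()
    {
        var doc = new HtmlDocument();
        // Hypothetical sample markup, not taken from the question.
        doc.LoadHtml(
            "<body>" +
            "<div id=\"abc\">matched by id</div>" +
            "<p class=\"abc big\">matched by class</p>" +
            "<span class=\"abcd\">not matched</span>" +
            "</body>");

        // Match elements whose class list contains the token abc,
        // or whose id is exactly abc.
        string xpath =
            "//*[contains(concat(' ', normalize-space(@class), ' '), ' abc ')" +
            " or @id='abc']";

        var nodes = doc.DocumentNode.SelectNodes(xpath);

        // SelectNodes returns null when there is no match.
        if (nodes != null)
        {
            foreach (var node in nodes)
            {
                Console.WriteLine(node.Name + ": " + node.InnerText);
            }
        }
    }
}
```

With this markup the div and the p should be selected, while the span with class abcd should not, because the space-padded contains test only matches whole class tokens.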
