Please note: This question is a more refined version of a previous question.
I am looking for an XPath that lets me find elements with a given plain text in an HTML document. For example, suppose I have the following HTML:
<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<nested>
<someOtherElement>This can <em>not</em> be found most nested</someOtherElement>
</nested>
<yetAnotherElement>This can <em>not</em> be found</yetAnotherElement>
</body>
</html>
I need to search by text and am able to find <someElement> using the following XPath:
//*[contains(text(), 'This can be found')]
I am looking for a similar XPath that lets me find <someOtherElement> and <yetAnotherElement> using the plain text "This can not be found". The following does not work:
//*[contains(text(), 'This can not be found')]
I understand that this is because the nested <em> element "disrupts" the text flow of "This can not be found". Is it possible in XPath to ignore nested elements like this one when matching by text?
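To make the difference concrete, here is a minimal Python sketch that uses only the standard library's xml.etree.ElementTree. It does not run XPath `contains()` (ElementTree does not support it). Instead it approximates the two views of an element's text: `first_text_node` is roughly what `contains(text(), ...)` sees in XPath 1.0 (only the first text node), and `string_value` is roughly what `contains(., ...)` would see (all descendant text concatenated). The helper names and the trimmed copy of the document are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document above (the <head> is omitted).
DOC = """<html><body>
<someElement>This can be found</someElement>
<nested>
<someOtherElement>This can <em>not</em> be found most nested</someOtherElement>
</nested>
<yetAnotherElement>This can <em>not</em> be found</yetAnotherElement>
</body></html>"""

NEEDLE = "This can not be found"


def first_text_node(elem):
    # Approximates what contains(text(), ...) tests in XPath 1.0:
    # only the text that precedes the first child element.
    return elem.text or ""


def string_value(elem):
    # Approximates the XPath string value of an element:
    # the text of the element and all of its descendants, concatenated.
    return "".join(elem.itertext())


def find_innermost(root, needle):
    # Hypothetical helper: keep elements whose full text contains the needle
    # while none of their child elements' full text does, so that ancestors
    # such as <html>, <body>, and <nested> are not reported as well.
    hits = []
    for elem in root.iter():
        if needle in string_value(elem) and not any(
            needle in string_value(child) for child in elem
        ):
            hits.append(elem.tag)
    return hits


root = ET.fromstring(DOC)

# Matching against the first text node only finds nothing, as in the question.
by_first_text = [e.tag for e in root.iter() if NEEDLE in first_text_node(e)]

# Matching against the concatenated text finds both target elements.
by_string_value = find_innermost(root, NEEDLE)

print(by_first_text)
print(by_string_value)
```

In XPath 1.0 terms, the second approach would correspond to testing the context node `.` instead of `text()`, for example something like `//*[contains(., 'This can not be found') and not(*[contains(., 'This can not be found')])]`. Treat that expression as an untested sketch rather than a confirmed solution for any particular XPath engine.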