Specify optional element in XPath path?

Question

Consider the following HTML:

<div>
  text1
</div>
<div>
  <span>
    text2
  </span>
</div>
<div>
  text3
</div>

I need to select all the nodes with text1/text2/text3. When I use

/html/body/div[position() > 0]

I obviously don't get the span around text2, but the div around <span>text2</span>. How can I say: If there is a span following the div, then return the span; if the div is already the last element in a path, return the div? So the intended nodes would be:

div[0]
div[1]/span
div[2]

Update: This one works, but is there a shorter way to do it? (E.g., I am writing /html/body/div in both of them; is it possible to place the pipe symbol (or) later in the path?)

/html/body/div[position() > 0 and count(*) = 0] | /html/body/div[position() > 0]/span

You may want to take a look at this post: https://stackoverflow.com/questions/14631590/get-text-content-of-an-html-element-using-xpath . Using /html/body/div/text() should give you what you need. But I may have misunderstood what you want. — Gaël Barbin, Apr 10 '21 at 14:30
What version of XPath? In XPath 2.0+, you can do `/html/body/div/(span | self::div[not(span)])`, but XPath 1.0 doesn't support that syntax, so you're either stuck with `/html/body/div[not(span)] | /html/body/div/span` or first select all `/html/body/div` and then select `span | self::div[not(span)]` from there. — JLRishe, Apr 10 '21 at 14:35
@Gaël Thanks, it worked with `/html/body/div//text()` (note the two `//` before `text()`) :-) Please consider posting as answer. — stefan.at.kotlin, Apr 10 '21 at 14:41
@JLRishe Thanks, I had to look up the version, but as I am in a browser context (Chrome) it's 1.0 as I learned. But yes, the one for version 2 would have been what I expected. Unfortunately I am limited to 1.0 ): You could also post as an answer for those able to use 2.0 — stefan.at.kotlin, Apr 10 '21 at 14:42
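
For readers who cannot use the XPath 2.0 form, the two-step approach from the comment above (first select every /html/body/div, then take `span | self::div[not(span)]` from each) can be sketched outside the browser as well. The following is only an illustration in Python's standard library, not the asker's Chrome setup. It assumes the sample markup is wrapped in html/body so that it parses as XML, and `select_text_holders` is a hypothetical helper name. ElementTree supports only a subset of XPath (no `|` union), so the second step is written in plain Python.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in html/body (an assumption)
# so that it is well-formed XML.
DOC = """<html><body>
<div>
  text1
</div>
<div>
  <span>
    text2
  </span>
</div>
<div>
  text3
</div>
</body></html>"""


def select_text_holders(root):
    """Return each div's span children, or the div itself if it has none."""
    selected = []
    for div in root.findall("./body/div"):  # step 1: every /html/body/div
        spans = div.findall("span")         # step 2: span | self::div[not(span)]
        selected.extend(spans if spans else [div])
    return selected


root = ET.fromstring(DOC)
nodes = select_text_holders(root)
print([(n.tag, (n.text or "").strip()) for n in nodes])
```

The result keeps document order, which is also what the XPath 1.0 union `/html/body/div[not(span)] | /html/body/div/span` would give.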

Gaël Barbin · Accepted Answer · 2021-04-10T15:00:00.547


To select the text inside a node, you can use the text() node test.

So if you want to select all the text nodes below a root node, you can use this XPath expression:

//ROOT_NODE//text()

So, for your example and as you said in your comment:

/html/body/div//text()
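
As a rough cross-check outside the browser, the same idea can be sketched with Python's standard library. This is an illustration under assumptions, not part of the original answer. ElementTree has no text() node test, so `itertext()` stands in for `//text()`, and `descendant_texts` is a hypothetical helper name. The sketch also drops whitespace-only chunks, whereas the XPath expression would return those whitespace text nodes too.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in html/body (an assumption)
# so that it is well-formed XML.
DOC = """<html><body>
<div>
  text1
</div>
<div>
  <span>
    text2
  </span>
</div>
<div>
  text3
</div>
</body></html>"""


def descendant_texts(root):
    """Collect the non-blank text found at any depth below each body/div."""
    texts = []
    for div in root.findall("./body/div"):
        for chunk in div.itertext():  # walks text at every depth, like //text()
            if chunk.strip():         # skip whitespace-only text
                texts.append(chunk.strip())
    return texts


root = ET.fromstring(DOC)
texts = descendant_texts(root)
print(texts)
```

Note that this yields strings rather than elements, so it fits when only the text is needed and not the surrounding div or span.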

edited Apr 10 '21 at 15:00

answered Apr 10 '21 at 14:53

Gaël Barbin

