I'm trying to come up with a clean XPath 1.0 expression to return all nodes between two specific nodes. The requirements are:
- All nodes after the very last numeric node;
- but before the very first node that starts with an opening parenthesis.
To test what I'm doing you could use:
<t>
<s>ABC</s>
<s>123</s>
<s>DEF</s>
<s>456</s>
<s>GHI</s>
<s>JKL</s>
<s>(M)</s>
<s>NOP</s>
<s>(Q)</s>
</t>
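To pin down the expected result independently of XPath, here is a minimal Python sketch using only the standard library. It is not an XPath solution, just a reference for what the two requirements should select. The function name `between` and the regular expression approximating the `.*0=0` numeric test are my own assumptions, as is the fallback behaviour when no numeric node or no parenthesis node exists.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = """<t>
<s>ABC</s>
<s>123</s>
<s>DEF</s>
<s>456</s>
<s>GHI</s>
<s>JKL</s>
<s>(M)</s>
<s>NOP</s>
<s>(Q)</s>
</t>"""

# Assumption: approximates the XPath test .*0=0, i.e. the text converts to a number.
NUMERIC = re.compile(r"\s*-?(\d+(\.\d*)?|\.\d+)\s*")


def between(xml_text):
    """Return the <s> texts after the last numeric node and before the first '(' node."""
    texts = [s.text or "" for s in ET.fromstring(xml_text).iter("s")]
    numeric_idx = [i for i, t in enumerate(texts) if NUMERIC.fullmatch(t)]
    paren_idx = [i for i, t in enumerate(texts) if t.startswith("(")]
    # Assumption: with no numeric node start at the beginning,
    # with no parenthesis node run to the end.
    start = numeric_idx[-1] + 1 if numeric_idx else 0
    stop = paren_idx[0] if paren_idx else len(texts)
    return texts[start:stop]


print(between(SAMPLE))  # ['GHI', 'JKL']
```
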
To only end up with both 'GHI' and 'JKL' I came up with:
//s[position()<count(//s[starts-with(., '(')][1]/preceding::*)+1][position()>count(//s[.*0=0][last()]/preceding::*)+1]
It works just fine, but I can't help feeling this can be done much more smoothly. I thought there should be a way to reuse the nodes retrieved by an earlier part of the XPath, e.g.:
- To get all nodes after the last numeric node:
[.*0=0][last()]/following::s
- To get the first node that starts with an opening parenthesis:
[starts-with(., '(')][1]
But I fail to combine these statements into valid syntax. I was thinking of something along the lines of:
//s[.*0=0][last()]/following::s[position()<[starts-with(.,'(')][1]]
which obviously is incorrect.
Any ideas, or am I stuck with what I had in the first place? I'm using this in the Excel function FILTERXML().