Here's a sample XML document:
<?xml version="1.0" ?>
<someparent>
<somechild>
<description>I want this</description>
<id>98</id>
</somechild>
<somechild>
<description>I don't want this</description>
<id>98</id>
</somechild>
<somechild>
<description>I want this too</description>
<id>2</id>
</somechild>
<somechild>
<description>Nope, not that one</description>
<id>2</id>
</somechild>
<somechild>
<description>Not that one either</description>
<id>2</id>
</somechild>
<somechild>
<description>Yep, I want this</description>
<id>41</id>
</somechild>
</someparent>
The <id> elements are always grouped: all elements with the same <id> value follow each other in the document. I may have thousands of different <id> values in a single file. What I want is to find each <somechild> element that is the first occurrence in its <id> group. So my expected result would be:
<somechild>
<description>I want this</description>
<id>98</id>
</somechild>
<somechild>
<description>I want this too</description>
<id>2</id>
</somechild>
<somechild>
<description>Yep, I want this</description>
<id>41</id>
</somechild>
I need a single XPath expression to select all of these "first items in a group". I have tried various combinations of the following-sibling and preceding-sibling axes, but I can't get it quite right. I have come very close to what I want to achieve with the following expression:
//someparent/somechild/id[text()=parent::somechild/preceding-sibling::somechild/id[text()]]/parent::somechild
This actually returns all the nodes I don't want, as it selects all the items that are not the first in their group (so it's essentially a perfect negative of what I want!). But for the life of me, I haven't been able to figure out how to reverse the results.
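To make the expected selection unambiguous, here is a rough procedural sketch of it in Python, using only the standard library's xml.etree.ElementTree. It is not the XPath I'm after, just a reference for the result I expect. The names SAMPLE and first_of_each_group are made up for this illustration, and the sketch assumes, as described above, that equal <id> values are always adjacent.

```python
import xml.etree.ElementTree as ET

# Copy of the sample document from the question (XML declaration omitted).
SAMPLE = """<someparent>
<somechild><description>I want this</description><id>98</id></somechild>
<somechild><description>I don't want this</description><id>98</id></somechild>
<somechild><description>I want this too</description><id>2</id></somechild>
<somechild><description>Nope, not that one</description><id>2</id></somechild>
<somechild><description>Not that one either</description><id>2</id></somechild>
<somechild><description>Yep, I want this</description><id>41</id></somechild>
</someparent>"""


def first_of_each_group(root):
    """Return each <somechild> whose <id> differs from the previous sibling's <id>.

    Hypothetical helper: relies on equal ids being contiguous in the document.
    """
    firsts = []
    previous_id = None
    for child in root.findall("somechild"):
        current_id = child.findtext("id")
        if current_id != previous_id:
            # The id changed, so this element starts a new group.
            firsts.append(child)
        previous_id = current_id
    return firsts


root = ET.fromstring(SAMPLE)
descriptions = [c.findtext("description") for c in first_of_each_group(root)]
print(descriptions)
```

For the sample above, this prints the three descriptions from my expected result, in document order. What I'm looking for is an XPath expression that selects the same <somechild> elements.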
Any help would be greatly appreciated.