Using the count(preceding-sibling::*) XPath expression, one can obtain an incrementing counter. Can the same be accomplished across a sequence that is two levels deep?

example XML instance

<grandfather>
    <father>
        <child>a</child>
    </father>
    <father>
        <child>b</child>
        <child>c</child>
    </father>
</grandfather>

code (with Saxon HE 9.4 jar on the CLASSPATH for XPath 2.0 features)

Trying to get a counter sequence of 1, 2 and 3 for the three child nodes with different kinds of XPath expressions:

    XPathExpression expr = xpath.compile("/grandfather/father/child");
    NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0 ; i < nodes.getLength() ; i++) {
        Node node = nodes.item(i);
        System.out.printf("child's index is: %s %s %s, name is: %s\n"
                          ,xpath.compile("count(preceding-sibling::*)").evaluate(node)
                          ,xpath.compile("count(preceding-sibling::child)").evaluate(node)
                          ,xpath.compile("//child/position()").evaluate(doc)
                          ,xpath.compile(".").evaluate(node));
    }

The above code prints:

child's index is: 0 0 1, name is: a
child's index is: 0 0 1, name is: b
child's index is: 1 1 1, name is: c

None of the three XPath expressions I tried produced the correct sequence: 1, 2, 3. Clearly it can trivially be done using the loop variable i, but I want to accomplish it with XPath if possible. I also need to keep the basic framework of evaluating one XPath expression to get all the nodes to visit and then iterating over that set, since that is how the real application I work on is structured. Basically, I visit each node and then need to evaluate a number of XPath expressions on it (node) or on the document (doc); one of these XPath expressions is supposed to produce this incrementing sequence.

Marcus Junius Brutus

2 Answers

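Use the preceding axis with a name test instead. Unlike preceding-sibling, it also reaches child elements under earlier father elements. The count is zero-based, so add 1 to get 1, 2, 3:

count(preceding::child) + 1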
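
As a rough sketch of how this could fit the loop from the question, the following evaluates count(preceding::child) + 1 against each node. The + 1 turns the zero-based count into the 1, 2, 3 sequence asked for. The class name PrecedingAxisCounter and the method indexedChildren are made up for illustration, and the XML is inlined from the question. The preceding axis is part of XPath 1.0, so this assumes only the JDK's built-in JAXP implementation, not Saxon.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name is not from the question.
class PrecedingAxisCounter {
    // The XML instance from the question, without whitespace.
    static final String XML =
          "<grandfather>"
        + "<father><child>a</child></father>"
        + "<father><child>b</child><child>c</child></father>"
        + "</grandfather>";

    // Returns one "index text" line per child element, in document order.
    static List<String> indexedChildren() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.compile("/grandfather/father/child")
                    .evaluate(doc, XPathConstants.NODESET);
            // Compile once, then evaluate with each child as the context node.
            XPathExpression index = xpath.compile("count(preceding::child) + 1");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                Double n = (Double) index.evaluate(node, XPathConstants.NUMBER);
                lines.add(n.intValue() + " " + node.getTextContent());
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        indexedChildren().forEach(System.out::println);
    }
}
```

For the sample document this should yield 1 a, 2 b, 3 c. Note that each evaluation walks all preceding nodes, so the total cost grows quadratically with the number of child elements.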

Using XPath 2.0, there is a much better way to do this. Fetch all <child/> nodes and use the position() function to get the index:

//child/concat("child's index is: ", position(), ", name is: ", text())
Jens Erat
  • Neither of these suggestions seems to work; I updated the post. `preceding::child` in particular produces the same sequence as `preceding::*` – Marcus Junius Brutus Jul 02 '13 at 22:21
  • You used `preceding-sibling::child` in your first query, not `preceding::child`. My second query produces the whole output you need in one go; you cannot apply it the way you tried. `position()` returns the position _within the current result set_. Anyway: your approach has a runtime complexity of O(n^2) in the number of child nodes and O(n) in the number of queries sent; with mine you're at O(n) for the nodes and have exactly one query. – Jens Erat Jul 03 '13 at 08:32
You don't say efficiency is important, but I really hate to see this done with O(n^2) code! Jens' solution shows how to avoid that if you can use the result in the form of a sequence of (position, name) pairs. You could also return an alternating sequence of strings and numbers using //child/(string(.), position()), though you would then want to use the s9api API rather than JAXP, because JAXP can only really handle the data types that arise in XPath 1.0.

If you need to compute the index of each node as part of other processing, it might still be worth computing the index for every node in a single initial pass, and then looking it up in a table. But if you're doing that, the simplest way is surely to iterate over the result of //child and build a map from nodes to the sequence number in the iteration.
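
A minimal sketch of that table approach, under some assumptions: the class ChildIndexTable and its methods buildIndex and demo are hypothetical names, the XML is inlined from the question, and the map is keyed on node identity, which relies on the DOM implementation handing back the same Node object on every query (the JDK's default DOM appears to do so).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name is not from the question or answers.
class ChildIndexTable {
    static final String XML =
          "<grandfather>"
        + "<father><child>a</child></father>"
        + "<father><child>b</child><child>c</child></father>"
        + "</grandfather>";

    // Single pass over //child: map each node to its 1-based sequence number.
    static Map<Node, Integer> buildIndex(XPath xpath, Document doc) throws Exception {
        NodeList all = (NodeList) xpath.evaluate("//child", doc, XPathConstants.NODESET);
        // Keyed on object identity (assumption: nodes are stable objects).
        Map<Node, Integer> index = new IdentityHashMap<>();
        for (int i = 0; i < all.getLength(); i++) {
            index.put(all.item(i), i + 1);
        }
        return index;
    }

    // Visits the nodes as in the question and looks each index up in the table.
    static List<String> demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<Node, Integer> index = buildIndex(xpath, doc);
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/grandfather/father/child", doc, XPathConstants.NODESET);
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                lines.add(index.get(node) + " " + node.getTextContent());
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The table is built once in linear time, so the per-node loop can look up the index alongside whatever other XPath expressions it evaluates, without rescanning preceding nodes.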

Michael Kay
  • In my code I need to visit a number of nodes and compute, for each node, a set of ~30 XPath expressions that are kept in a config file, and then bundle all ~30 returned values together in a structure that reflects the processing outcome of that node. Hence the outer loop and the separate XPath expressions on each loop iteration. If I fully understand your and Jens' suggested approach, I would need an alternating sequence with a "period" of 30 different values, or a way to return 30-tuples from XPath, which I don't know is possible at all. – Marcus Junius Brutus Jul 03 '13 at 16:09