Following on from this question about navigating collections using pos:

In eXist 4.7 I have a collection in myapp/data/ which contains thousands of TEI XML documents. I use the following solution from Martin Honnen to get the document before and after a certain document

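declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: collection path as given above; adjust to the full database path :)
let $data := collection('myapp/data')
let $examples := $data/tei:TEI[@type="example"]
for $example at $pos in $examples
where $example/@xml:id = 'TC0005'
return (
    $examples[$pos - 1],
    $example,
    $examples[$pos + 1]
    )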

With this I would have expected $examples[$pos - 1] to produce document 'TC0004' and $examples[$pos + 1] to produce 'TC0006' (based on the sort order seen in eXide collection navigation view for example). They do not, producing the inverse instead.

Honnen and Michael Kay responded that

ordering of documents within a collection is very much processor-dependent

Applying an order by $example/@xml:id ascending clause did not change the result for the better.

So, the question is how can I impose an alpha-numeric order on $data?

Many thanks.

jbrehr

2 Answers

It seems at the XQuery level you can change let $examples := $data/tei:TEI[@type="example"] to

let $examples := sort($data/tei:TEI[@type="example"], (), function($e) { $e/@xml:id })

(assuming the XQuery/XPath 3.1 higher-order sort function is available) or to

let $examples := for $e in $data/tei:TEI[@type="example"] order by $e/@xml:id return $e

using the order by clause.

I don't know whether exist-db has some way to impose an order during the creation or during the selection of a collection.

Martin Honnen
  • Applying the `sort` function did not have an effect in eXist. The second option works, although it feels as though it adds some extra processing time. – jbrehr Aug 19 '19 at 16:14
  • Please could you open an issue for `fn:sort` if it is not working – adamretter Sep 13 '19 at 17:19

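Based on experience with older versions of eXist, the $pos value bound with at in a for clause is not the position in the sorted result. It is the item's position in the input sequence as it is iterated, before any order by clause is applied, which is why adding order by to the same FLWOR expression does not help.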
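
A small illustration of that behaviour, using a literal sequence of strings instead of the TEI collection (the values are made up for the example):

```xquery
xquery version "3.1";

(: $pos is bound from the input sequence ('c' = 1, 'a' = 2, 'b' = 3)
   before order by rearranges the tuples :)
for $x at $pos in ('c', 'a', 'b')
order by $x ascending
return $pos || ':' || $x
(: returns "2:a", "3:b", "1:c" :)
```

The items come back sorted, but each one still carries the position it had in the unsorted input, so $pos - 1 and $pos + 1 point at input neighbours, not sorted neighbours.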

What you first want to do is create an ordered list, then get the three items from the list you're looking for.

(: select the tei:TEI elements themselves, so that $e/@xml:id resolves :)
let $data := collection('myapp/data')/tei:TEI[@type eq 'example']
let $examples := for $e in $data order by $e/@xml:id ascending return $e
let $pos := index-of($examples/@xml:id, 'TC0005')
return if (count($pos) eq 1) then (
  if ($pos gt 1) then $examples[$pos - 1] else (),
  $examples[$pos],
  $examples[$pos + 1]
) else ()

A potential problem with this approach is that you'll have to sort all items every time. Creating a sorted cached list may alleviate this problem and would also allow for a much more efficient query, where you can use preceding-sibling and following-sibling from the query result.
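
A rough sketch of what such a sorted list could look like, built in memory here. The index and entry element names are hypothetical, the collection path is taken from the question and may need adjusting, and storing the list so it is not rebuilt on every request is left out:

```xquery
xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: assumed collection path; adjust to the full database path :)
let $data := collection('myapp/data')

(: hypothetical lightweight index: one <entry> per document, sorted by xml:id :)
let $index :=
    <index>{
        for $e in $data/tei:TEI[@type eq 'example']
        order by $e/@xml:id ascending
        return <entry id="{$e/@xml:id}"/>
    }</index>

let $current := $index/entry[@id eq 'TC0005']
return (
    (: the nearest sibling on each side is the sorted neighbour :)
    $current/preceding-sibling::entry[1]/@id/string(),
    $current/@id/string(),
    $current/following-sibling::entry[1]/@id/string()
)
```

This returns the neighbouring IDs rather than the documents; each document can then be fetched with a predicate on @xml:id.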

Another potential solution, if the naming convention for the IDs is consistent, would be to query the before and after IDs.
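
A minimal sketch of that idea, assuming every ID is a two-letter prefix followed by exactly four digits and that the numbering has no gaps. The function local:neighbour-id is a made-up helper, and the collection path is again taken from the question:

```xquery
xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: hypothetical helper: 'TC0005' with offset -1 gives 'TC0004' :)
declare function local:neighbour-id($id as xs:string, $offset as xs:integer) as xs:string {
    let $prefix := substring($id, 1, 2)
    let $n := xs:integer(substring($id, 3)) + $offset
    (: zero-pad to four digits: 10000 + 4 = 10004, then drop the leading 1 :)
    return concat($prefix, substring(string(10000 + $n), 2))
};

(: assumed collection path; adjust to the full database path :)
let $examples := collection('myapp/data')/tei:TEI[@type eq 'example']
let $id := 'TC0005'
return (
    $examples[@xml:id = local:neighbour-id($id, -1)],
    $examples[@xml:id = $id],
    $examples[@xml:id = local:neighbour-id($id, 1)]
)
```

No sorting is needed, but if an ID is missing from the sequence the corresponding neighbour simply comes back empty.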

The check that $pos contains exactly one item guards against cases where @xml:id is not unique (yes, that would be against the spec, but it happens in real-world data) or where no matching item exists. Keep in mind that index-of returns a sequence of positions, containing zero or more integers, not a single value.
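
For illustration, this is how index-of behaves on a made-up list of IDs with a duplicate and with a value that is not present:

```xquery
xquery version "3.1";

let $ids := ('TC0004', 'TC0005', 'TC0005')
return (
    count(index-of($ids, 'TC0005')),  (: 2, the duplicate matches at positions 2 and 3 :)
    count(index-of($ids, 'TC0009'))   (: 0, no match yields the empty sequence :)
)
```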

westbaystars