
Is it possible to guarantee that the results of a transitive query in SPARQL come back in the order in which they were walked?

So, given some simple data:

@prefix ex: <http://example.com/> .

<http://example.com/step0> ex:contains <http://example.com/step1> .
<http://example.com/step1> ex:contains <http://example.com/step2> .
<http://example.com/step2> ex:contains <http://example.com/step3> .

(in practice the relation could repeat many more times)

Query (using SPARQL 1.1):

PREFIX ex: <http://example.com/>

SELECT ?parent
WHERE {
    ?parent ex:contains* <http://example.com/step3>
}

The goal is to always get back [step0, step1, step2]. When I try this in Jena, I get consistent but arbitrarily ordered results.

Alternatively, it would be fine to get back both the parent and child for each step of the transitive walk, so that I could re-order the results outside the query. However, I don't know how to both bind ?parent ex:contains* <http://example.com/step3> and get back the objects of the intermediate relations without writing a very slow nested query with filtering.

Ben Pennell
  • Your query returns only one result, `http://example.com/step2`, so I don't understand why you say that it works in Jena but the results are ordered randomly. I also can't see how it handles transitivity, since no property path such as `ex:contains*` is used – UninformedUser Apr 05 '17 at 00:41
  • Sorry about that, I left out the most important single character in the question. It was supposed to be ex:contains*, I've updated the question – Ben Pennell Apr 05 '17 at 03:56

3 Answers


For simple linear paths, you could use the number of hops as a measure for ordering:

PREFIX  ex:   <http://example.com/>

SELECT  ?start
WHERE
  { ?start (ex:contains)+ ?mid .
    ?mid (ex:contains)* ex:step3
  }
GROUP BY ?start
ORDER BY DESC(COUNT(?mid))

Output:

------------
| start    |
============
| ex:step0 |
| ex:step1 |
| ex:step2 |
------------
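
SPARQL gives no meaning to the textual order of triple patterns, so the same query can be written with the fixed end point first. This is a sketch of that variant. Whether it runs faster is an assumption that depends on the engine and its optimizer, so it is worth measuring on your own data:

```sparql
PREFIX  ex:   <http://example.com/>

# Same result set as the query above; only the pattern order differs.
SELECT  ?start
WHERE
  { # Anchor on the fixed end point first: every node on a path to step3.
    ?mid (ex:contains)* ex:step3 .
    # Then find each ancestor of those nodes.
    ?start (ex:contains)+ ?mid
  }
GROUP BY ?start
ORDER BY DESC(COUNT(?mid))
```

With the sample data, ex:step0 reaches three nodes on the path, ex:step1 two, and ex:step2 one, so the descending count matches the walk order. This only holds for a single linear chain.
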
UninformedUser
  • After some testing it looks like this is working, thank you! The only issue is that the performance degrades quite significantly with the size of the graph (appears to be linear). However, if I reverse the conditions in the WHERE clause to "{?mid (ex:contains)* ex:step3 . ?start (ex:contains)+ ?mid}" the ordering is still correct but the performance does not seem to drop off with a populated graph anymore. It is still about 2-3 times slower than the original query, but that is acceptable versus having to perform multiple queries. – Ben Pennell Apr 05 '17 at 14:48

Is it possible to guarantee that the results of a transitive query in SPARQL come back in the order in which they were walked?

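No. The SPARQL 1.1 standard does not define a result order for property paths, and a query without ORDER BY returns its solutions in an unspecified order.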

Here, the fixed object and the fact the data is a linear path happen to mean there is a natural walk order.

Apache Jena's SPARQL execution is deterministic in this case, so the results come out in some stable order, but only because the internal collection of results retains order. Not all Jena versions do this; it has changed over time.

For other, non-linear, paths nothing is certain. Data is stored using hash maps.

AndyS
  • Thank you, I was having difficulty finding anything that explicitly mentioned whether transitive querying had any ordering mechanism, so it is good to have this confirmed. – Ben Pennell Apr 05 '17 at 14:50

Given your data example, try:

SELECT ?parent ?child ?subchild
WHERE {
    ?parent ex:contains <http://example.com/step3> .
    ?child ex:contains ?parent .
    OPTIONAL { ?subchild ex:contains ?child . }
}

If not all ex:contains relationships go three levels, you may need to do an OPTIONAL pattern match.
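
For chains of unknown depth, one option is to pair a single explicit ex:contains step with a property path, so that each row carries one edge of the walk. This is a minimal sketch, assuming the ex: prefix maps to http://example.com/ as in the sample data. The rows still arrive in no guaranteed order, but the ?parent/?child pairs are enough to rebuild the chain outside the query:

```sparql
PREFIX ex: <http://example.com/>

SELECT ?parent ?child
WHERE {
    # Every node that reaches step3 in zero or more hops (includes step3 itself).
    ?child ex:contains* ex:step3 .
    # The direct parent of each such node: one edge of the walk per row.
    ?parent ex:contains ?child .
}
```

With the sample data this yields three rows, (step2, step3), (step1, step2) and (step0, step1), in some order. Following each ?child to the row where it appears as ?parent reconstructs the walk.
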

scotthenninger
  • Thanks for your answer. I accidentally left off the * on `ex:contains*`, which makes the query transitive. I have updated the question. The transitive relationship in my case can go to unlimited depth. I'll need to take a look at how OPTIONAL is used; I had seen it in other answers but had gotten the impression that it was a Virtuoso-specific syntax. – Ben Pennell Apr 05 '17 at 04:06
  • For the update in your OP, see the response by @ASKW. For an example of using `OPTIONAL`, I edited the above response. – scotthenninger Apr 05 '17 at 17:25