
I am wondering how group patterns are evaluated in SPARQL. My assumption was that each group pattern is evaluated separately, then solution bindings from groups are joined together. However, it seems not to be the case.

Let's take this example data:

:film1 :hasDirector :director1.

And the following example query:

select * where {
  {?a :hasDirector ?c.} 
  {optional {?c :fromCountry ?e}}.
}

I would assume that each group is evaluated separately and that the results from both groups are then joined. In terms of relational algebra, it would look like first_group INNER-JOIN second_group. However, that's not the case. Evaluating each group separately, the first group pattern yields the solution ?a = :film1, ?c = :director1. The second triple pattern does not yield any solution. If my assumption were correct, joining the results wouldn't return any solution. However, this query returns one solution with ?a = :film1, ?c = :director1, and ?e unbound.

This result is the same as if no groups {} were used, that is, as if the following query were executed:

select * where {
  ?a :hasDirector  ?c.
  optional {?c :fromCountry ?e}.
}

The last query (relationally again, to facilitate understanding) corresponds to first_group LEFT-OUTER-JOIN second_group.

How are group patterns evaluated in SPARQL? What am I missing here?

PS. I am using GraphDB for testing...

EDIT1:

I am now trying to get the algebra of the queries via Jena ARQ... It seems to confirm what I expected?

This is what I get from Jena ARQ for the first query algebra:

   (join
      (bgp (triple ?a <http://www.example.com/hasDirector> ?c))
      (leftjoin
        (table unit)
        (bgp (triple ?c <http://www.example.com/fromCountry> ?e))))

Second query:

(leftjoin
  (bgp (triple ?a <http://www.example.com/hasDirector> ?c))
  (bgp (triple ?c <http://www.example.com/fromCountry> ?e)))

EDIT2:

Jena gives the same results as GraphDB for the first query, even though the algebra looks as I just showed.

EDIT3:

Could this be the reason? The treatment of unbound values (similar to NULL in a relational DB) in joins is quite strange. See Appendix C here.

EDIT4:

It seems the problem is as mentioned in EDIT3: adding FILTER (BOUND (?c)) to the first query gives the expected result!

select * where {
  {?a :hasDirector ?c.} 
  {optional {?c :fromCountry ?e} FILTER (BOUND (?c))}.
}

But now again... What should the default behaviour be when two consecutive group patterns occur? Joining them, disregarding this unbound (null) issue?

Median Hilal
  • 1
    The `optional` aka LEFT JOIN does not reduce the solutions from the left side, so in both cases, it can only add bindings for `?e` for the current solution from the left part (solutions are coming from either an evaluation of the :hasDirector pattern for the first case or from the evaluation of the empty `{}` bgp, which has a single solution with no bindings, for the second case) – Damyan Ognyanov May 10 '19 at 07:58
  • So you are saying: 1. In the first query, (no groups used), the left-hand side for optional is an empty group pattern {} which has a single solution with no bindings. Joining this solution with the first group would produce my "unexpected" result. – Median Hilal May 10 '19 at 09:04
  • 2. My assumption about joining groups is correct? (groups like `{first_group} {second_group}` are evaluated internally and then their bindings are joined)? – Median Hilal May 10 '19 at 09:06

1 Answer


You say:

The second triple pattern does not yield any solution

That is correct. But the second group, {optional {...}}, does yield a solution. That's because { OPTIONAL { A } } is equivalent to { {} OPTIONAL { A } } which is equivalent to {} if A has no solutions. The empty group {} always produces one solution that does not bind any variables, also known as the empty solution.

So your first query is a join of two one-solution sequences. On the left-hand side is the solution to the :hasDirector triple pattern. On the right-hand side is the empty solution. The cross product produces only one combination; the invisible join condition removes any combination with clashing variable bindings, but there are no clashes here, so we keep the single combination. So, the result is that one binding you see.
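The reasoning above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how GraphDB or Jena implement it: each solution is a plain dict of bound variables, and the helper names `compatible`, `join`, `left_join`, and `UNIT` are made up for illustration.

```python
# Toy model of SPARQL solution sequences (hypothetical helpers, not an engine API).
# Each solution is a dict that contains only the variables it binds.

def compatible(m1, m2):
    """Two solutions are compatible if every shared variable has the same value."""
    return all(m2[v] == val for v, val in m1.items() if v in m2)

def join(left, right):
    """Join: merge every compatible pair of solutions."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]

def left_join(left, right):
    """LeftJoin without a filter: keep each left solution, extended where possible."""
    out = []
    for l in left:
        extended = [{**l, **r} for r in right if compatible(l, r)]
        out.extend(extended if extended else [l])
    return out

UNIT = [{}]  # the empty group {}: one solution with no bindings

tp1 = [{"a": ":film1", "c": ":director1"}]  # solutions of ?a :hasDirector ?c
tp2 = []                                     # ?c :fromCountry ?e matches nothing

# First query: Join(TP1, LeftJoin(unit, TP2))
first_query = join(tp1, left_join(UNIT, tp2))

# Second query: LeftJoin(TP1, TP2)
second_query = left_join(tp1, tp2)

print(first_query)
print(second_query)
```

In this model the second group evaluates to `[{}]`, not `[]`, which is why the join keeps the solution from the first group and both queries end up with the same single solution.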

Your second query is different from your first query. Its basic structure is { {TP1} OPTIONAL {TP2} }. So the left join is now between the two triple patterns, and no implicit extra empty group is inserted before the OPTIONAL.

In your edit 4 you added a filter condition to the second group that evaluates to false on the empty solution. So the empty solution is removed from the solution sequence, and now the second group actually has no results. The join is now between a one-solution sequence and a zero-solution sequence, which trivially results in no solutions. This explains your edit 4.
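The effect of the filter can be illustrated with the same kind of toy model. Again, this is a sketch under the assumption that solutions are dicts; `bound`, `compatible`, and `join` are hypothetical helpers, not part of any SPARQL library.

```python
# Toy model: FILTER (BOUND(?c)) applied to the result of { OPTIONAL { ... } }.

def bound(solution, var):
    """BOUND(?var) is true only if the solution binds the variable."""
    return var in solution

def compatible(m1, m2):
    """Two solutions are compatible if every shared variable has the same value."""
    return all(m2[v] == val for v, val in m1.items() if v in m2)

def join(left, right):
    """Join: merge every compatible pair of solutions."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]

tp1 = [{"a": ":film1", "c": ":director1"}]  # solutions of ?a :hasDirector ?c
second_group = [{}]                          # { OPTIONAL { ... } } with no matches

# The filter runs inside the second group, where ?c is not bound.
filtered = [s for s in second_group if bound(s, "c")]

result = join(tp1, filtered)
print(filtered, result)
```

Because the filter is evaluated within the second group, it never sees the `?c` binding from the first group; it only sees the empty solution and discards it.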

Unbound: SPARQL doesn't have NULL. “Unbound” in SPARQL is simply the condition of a variable not being bound to any value at all in a particular solution. SQL has rows and columns, so you have cells, and cells always have a value, but the value can be the special value NULL. SPARQL has rows but no columns; the “columns” that you see in a SELECT result are only introduced at the very end for presentation purposes, but play no role during query evaluation. In each row (a.k.a. solution), zero or more variables are bound, that is, they are assigned a value. And any other variables (an infinite number) are unbound.
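A small sketch may make the difference concrete. It assumes, purely for illustration, that a solution is a dict and that a hypothetical `project` function builds the result table at the very end; `None` here is only a display placeholder, not a value that exists during evaluation.

```python
# Toy model: a solution binds some variables; everything else is simply absent.
solutions = [{"a": ":film1", "c": ":director1"}]

def project(solutions, variables):
    """Lay solutions out as rows for display; unbound variables show up as None."""
    return [tuple(s.get(v) for v in variables) for s in solutions]

# "Columns" only appear here, at presentation time (SELECT *).
table = project(solutions, ["a", "c", "e"])

print("e" in solutions[0])  # ?e has no entry in the solution itself
print(table)
```

The solution never stores an entry for `?e`; the empty cell exists only in the projected table.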

cygri
  • Thank you for the explanation... However, to answer my main question which, as you could see, developed to these long edits... – Median Hilal May 10 '19 at 09:20
  • ... Is my assumption about joining group patterns correct? (Groups like `Where{{first_group}. {second_group}}` are evaluated internally and then their resulting bindings are joined.) Can you please refer me to some resource? – Median Hilal May 10 '19 at 09:20
  • 1
    @MedianHilal if you're asking about whether the evaluation is bottom-up, then the answer is *yes*. Same holds also for subqueries for example which are also evaluated first and then "combined" with the outer part of the query. So yes, group graph patterns are evaluated separately. By the way, a slightly similar question was asked some years ago: https://stackoverflow.com/questions/35314331/how-group-graph-pattern-work-in-sparql - you should maybe also have a look at the second comment from Andy below his answer where he also mentions things like entailment and filter eval and scope – UninformedUser May 10 '19 at 09:58
  • 1
    @MedianHilal Yes, evaluation of groups is bottom-up (or inside-out as I prefer to say). This becomes clear when reading the formalism in the SPARQL specification, but you'd more or less have to go through the whole thing until a clear picture emerges. A query produces a parse tree, which is translated to an algebra operation tree, whose evaluation is defined bottom-up. – cygri May 10 '19 at 11:23