In a question-answering system, I need to compare the set of questions each user has answered against the sets of questions associated with profiles, in order to group users into those profiles.

I am trying to find, in a single query, the (profile, user) pairs where the user has answered all the questions associated with the profile. I interpret this as: there is no question of the profile that the user has not answered (double negation).
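Spelled out in first-order logic, the reformulation looks roughly like this. The predicate names assoc and ans are shorthand introduced here for illustration only: assoc(p, q) stands for "profile p is associated with question q", and ans(u, q) for "user u has some answer whose question is q".

```latex
\forall q \,\bigl(\mathrm{assoc}(p,q) \rightarrow \mathrm{ans}(u,q)\bigr)
\;\equiv\;
\neg \exists q \,\bigl(\mathrm{assoc}(p,q) \wedge \neg\,\mathrm{ans}(u,q)\bigr)
```

One consequence of this reading: a profile with no associated questions satisfies the condition vacuously for every user.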

Here is a dataset for which the expected answer is :profile1 :alice:

@prefix :       <http://example/> .

# Alice has answered 2 questions
:alice a :User .
:alice :hasAnswer :a1, :a2 .
:a1 :question :q1 .
:a2 :question :q2 .

# Bob answered only q1
:bob a :User .
:bob :hasAnswer :b1 .
:b1 :question :q1 .

# Carl has not answered anything
:carl a :User .

# Profile 1 is associated with q1 and q2
:profile1 a :Profile .
:profile1 :associatedQuestion :q1, :q2 .

# Profile 2 is associated with q1 and q3
:profile2 a :Profile .
:profile2 :associatedQuestion :q1, :q3 .

This does not work:

PREFIX :       <http://example/>
SELECT ?user ?profile
WHERE {
    ?user a :User .
    ?profile a :Profile .
    FILTER NOT EXISTS {
        ?profile :associatedQuestion ?q .
        FILTER NOT EXISTS {
            ?user :hasAnswer ?a .
            ?a :question ?q .
        }
    }
}

I tried playing with MINUS instead of FILTER NOT EXISTS, changing variable names, etc.

I also tried a different approach: selecting the users who have answered the same number of questions as the profile contains. This seems like overkill, and I have doubts about its performance:

PREFIX :       <http://example/>
SELECT ?profile ?numberOfQuestions ?user (COUNT(?a) AS ?numberOfAnswers)
WHERE {
    ?profile a :Profile .
    ?profile :associatedQuestion ?question .
    OPTIONAL {
        ?user :hasAnswer ?a .
        ?a :question ?question .
    }

    {
        SELECT ?profile (COUNT(?question) AS ?numberOfQuestions)
        WHERE {
            ?profile a :Profile .
            ?profile :associatedQuestion ?question .
        }
        GROUP BY ?profile
    }
}
GROUP BY ?profile ?numberOfQuestions ?user
HAVING (?numberOfAnswers = ?numberOfQuestions)

Is there a way to achieve the desired result using double negation? Which query pattern would give the best performance? (I am working with Jena Fuseki, latest release.)

Thanks

ThomasFrancart
  • Query 1 works for me in Jena Fuseki 3.12.0 (Windows 10, in-memory dataset). – Stanislav Kralin Jul 01 '19 at 20:59
  • Works for me as well with the Jena 3.12.0 CLI tool `sparql --data ... --query ...`, so I guess you have some issue with your Fuseki and/or dataset config? Do you use named graphs or some other config setup? – UninformedUser Jul 02 '19 at 03:38
  • Ha, you are right. I don't know where I messed up, maybe in the variable names or, as you suggest, in the dataset configuration. Query 1 with double negation does work. Thanks for your time! – ThomasFrancart Jul 03 '19 at 07:37

1 Answer

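It turns out that the double-negation query (the first query in the question) works fine. As the comments suggest, the problem was on my side, most likely a mistake in the variable names or in the dataset configuration.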
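For reference, here is a slightly more compact sketch of the same double negation. It assumes the vocabulary from the question (:hasAnswer, :question, :associatedQuestion), data in the default graph, and a SPARQL 1.1 engine with property path support. The sequence path only replaces the intermediate ?a variable; the logic is unchanged.

```sparql
PREFIX : <http://example/>

SELECT ?user ?profile
WHERE {
    ?user a :User .
    ?profile a :Profile .
    # Keep the pair only if no question of the profile ...
    FILTER NOT EXISTS {
        ?profile :associatedQuestion ?q .
        # ... is left unanswered by this user.
        FILTER NOT EXISTS {
            ?user :hasAnswer/:question ?q .
        }
    }
}
```

On the sample data from the question, this should return only the pair :alice, :profile1. Note that a profile without any :associatedQuestion triples would be matched with every user, because the outer NOT EXISTS then holds vacuously.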

ThomasFrancart