In a question-answering system, I need to compare the set of questions each user has answered with the sets of questions associated with profiles, in order to group users into these profiles.
I am trying to find, in a single query, the tuples profile;user where the user has answered all the questions associated with the profile. I tend to interpret this as: there is no question of the profile that is not answered by the user (double negation).
Here is a dataset for which the expected answer is :profile1 :alice:
@prefix : <http://example/> .
# Alice has answered 2 questions
:alice a :User .
:alice :hasAnswer :a1, :a2 .
:a1 :question :q1 .
:a2 :question :q2 .
# Bob answered only q1
:bob a :User .
:bob :hasAnswer :b1 .
:b1 :question :q1 .
# Carl did not answer anything
:carl a :User .
# Profile 1 is associated to q1 and q2
:profile1 a :Profile .
:profile1 :associatedQuestion :q1, :q2 .
# Profile 2 is associated to q1 and q3
:profile2 a :Profile .
:profile2 :associatedQuestion :q1, :q3 .
This query does not work:
PREFIX : <http://example/>
SELECT ?user ?profile
WHERE {
  ?user a :User .
  ?profile a :Profile .
  FILTER NOT EXISTS {
    ?profile :associatedQuestion ?q .
    FILTER NOT EXISTS {
      ?user :hasAnswer ?a .
      ?a :question ?q .
    }
  }
}
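To make the intended semantics concrete, here is a minimal sketch in plain Python (not SPARQL) of the double negation I have in mind, run over the same data as the Turtle above. The dictionaries `answered` and `profile_questions` and the function `matching_pairs` are names made up for this illustration; they only mirror the dataset and are not part of any library.

```python
# Same data as the Turtle dataset, flattened to plain sets.
# (Illustrative names only; answers are reduced to the question they refer to.)
answered = {
    "alice": {"q1", "q2"},
    "bob": {"q1"},
    "carl": set(),
}
profile_questions = {
    "profile1": {"q1", "q2"},
    "profile2": {"q1", "q3"},
}


def matching_pairs(profile_questions, answered):
    """Return (profile, user) pairs where no profile question is unanswered."""
    pairs = []
    for profile, questions in sorted(profile_questions.items()):
        for user, done in sorted(answered.items()):
            # Double negation: there is NO question of the profile
            # that the user has NOT answered.
            if not any(q not in done for q in questions):
                pairs.append((profile, user))
    return pairs


print(matching_pairs(profile_questions, answered))
# -> [('profile1', 'alice')]
```

One thing this sketch makes visible: under this reading, a profile with no associated questions would match every user, since there is nothing left unanswered.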
I tried using MINUS instead of FILTER NOT EXISTS, changing variable names, etc.
I also tried a different approach: selecting the users who have answered as many of a profile's questions as the profile contains. This seems like overkill, and I have doubts about its performance:
PREFIX : <http://example/>
SELECT ?profile ?numberOfQuestions ?user (COUNT(?a) AS ?numberOfAnswers)
WHERE {
  ?profile a :Profile .
  ?profile :associatedQuestion ?question .
  OPTIONAL {
    ?user :hasAnswer ?a .
    ?a :question ?question .
  }
  {
    SELECT ?profile (COUNT(?question) AS ?numberOfQuestions)
    WHERE {
      ?profile a :Profile .
      ?profile :associatedQuestion ?question .
    }
    GROUP BY ?profile
  }
}
GROUP BY ?profile ?numberOfQuestions ?user
HAVING (?numberOfAnswers = ?numberOfQuestions)
Is there a way to achieve the desired result using a double negation? Which query pattern would give the best performance? (I am working with Jena Fuseki, latest release.)
Thanks