Datalog rule on rdf graph changing the (boolean) value of the fact

Question

I'm trying to write a simple Datalog rule to manipulate boolean values in an RDF ontology, using RDFox as the reasoner for now.

The RDF ontology looks something like this:

:citizenVaccinated rdfs:label "vaccinated";
    a :citizen;
    :isCitizenOf :UK .


:automatedDecisionMaking rdfs:label "automatedDecisionMaking";
    :hasValue xsd:True.


:basicInformationCheck rdfs:label "basicInformationCheck";
    rdf:type xsd:False.
    #:hasValue xsd:False.

I have written a small Datalog rule which says:

[:basicInformationCheck, rdf:type, xsd:True]:- [:citizenVaccinated, :isCitizenOf, :UK].

When I query the final graph for the value of basicInformationCheck, I get these facts:

rdf:type xsd:False
rdf:type  xsd:True

How can I change this so that the graph contains only the updated fact?

I think this question needs some work in order to be really useful and in order for others to provide helpful answers. May I suggest that you rephrase the title so that it is an actual question? — nickform, Mar 16 '22 at 09:50
Another suggestion - could you include the SPARQL query that you are running to get the results shown? — nickform, Mar 25 '22 at 08:28

score 3 · Answer 1 · answered Mar 15 '22 at 13:16

RDFox gives you the ability to specify the 'fact-domain' against which a query is run. This can be IDB (default) for all facts, EDB for explicit facts only, and IDBrepNoEDB for implicit facts only (this is the one you want to use). From version 5.5, these fact domains have been renamed to all, explicit, and derived.

To set them in the shell, simply do:

set query.fact-domain IDBrepNoEDB
select ....

You can also use these in REST on a per-query basis, by specifying the fact-domain URL parameter when submitting your query, e.g.:

curl -i -X POST "<user>:<pw>@<server>:<port>/datastores/<datastore_name>/sparql?fact-domain=IDBrepNoEDB" -d "query=SELECT..."
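For anyone issuing the same request from a script, here is a minimal Python sketch of the curl call above. It only builds the request and does not send it. The server address and data store name are hypothetical placeholders, and authentication is left out; only the endpoint path and the fact-domain parameter come from the curl example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(base_url, datastore, query, fact_domain):
    """Build (but do not send) a POST request for the SPARQL endpoint."""
    # The fact domain is passed as a URL parameter, as in the curl example.
    params = urlencode({"fact-domain": fact_domain})
    url = f"{base_url}/datastores/{datastore}/sparql?{params}"
    # The query travels in the form-encoded body, like curl's -d "query=...".
    body = urlencode({"query": query}).encode("utf-8")
    return Request(url, data=body, method="POST")


request = build_sparql_request(
    "http://localhost:12110",  # hypothetical server address
    "example",                 # hypothetical data store name
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o }",
    "IDBrepNoEDB",
)
print(request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; credentials would need to be added first for a secured server.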

nickform · Answer 2 · 2022-05-21T18:54:17.137

It seems that you're hoping to "change" a value. Something similar is possible but it's important to understand that Datalog cannot change property values that have been asserted, that is, that have been added as explicit facts, only add new facts derived from applying the Datalog rules to the explicit facts. This explains why you get both values in your query results: xsd:False because you added it explicitly and xsd:True because you added a rule that derives that value.
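To make that concrete, here is a toy Python model of what happens. It is not RDFox and does not use any RDFox API; triples are plain tuples and the single rule from the question is hard-coded. It only illustrates that a derived fact is added alongside the explicit one rather than replacing it.

```python
# Explicit facts, as loaded from the question's data (toy representation).
explicit = {
    (":citizenVaccinated", ":isCitizenOf", ":UK"),
    (":basicInformationCheck", "rdf:type", "xsd:False"),
}


def apply_rule(facts):
    """Mimic the question's rule: derive the head if the body triple is present."""
    derived = set()
    if (":citizenVaccinated", ":isCitizenOf", ":UK") in facts:
        derived.add((":basicInformationCheck", "rdf:type", "xsd:True"))
    return derived


derived = apply_rule(explicit)
all_facts = explicit | derived  # the default domain holds both kinds of fact

values = sorted(
    o for s, p, o in all_facts
    if s == ":basicInformationCheck" and p == "rdf:type"
)
print(values)  # ['xsd:False', 'xsd:True']
```

The rule never touches the explicit xsd:False triple, so a query over all facts returns both values.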

As described in @valerio-cocchi's answer, facts added by reasoning are modelled as belonging to the derived domain, whereas base facts added by loading RDF graphs belong to the explicit domain. While you could restrict your query to one or other of those domains, I think what you want is to continue querying against the domain that contains both types of fact together (the default behaviour). Instead, ensure that the property whose value you want to change is always set by Datalog rules and never asserted explicitly. That way you can achieve a change from false to true when the relevant condition is met. Doing this requires a feature of RDFox's Datalog called negation-as-failure.

Using negation-as-failure to model defaults

The following example is loosely based on the Expressing Defaults and Exceptions example from the Common Uses of Rules in Practice section of the RDFox documentation. I highly recommend reading that entire section to get a feel for what rules do.

We'll start with the following explicit fact in Turtle format which just states that Tweety (who has URI :tweety) is a citizen of the UK:

:tweety a :UKCitizen .

We would like to ensure that for each such citizen in our data store, we have a boolean property indicating whether or not they pass a basic information check. This property should be false unless we have an explicit fact to say that the citizen is vaccinated. The following pair of rules achieves this:

[?citizen, :basicInformationCheck, xsd:False] :-
    [?citizen, a, :UKCitizen],
    NOT [?citizen, :isVaccinated, xsd:True] .

[?citizen, :basicInformationCheck, xsd:True] :-
    [?citizen, a, :UKCitizen],
    [?citizen, :isVaccinated, xsd:True] .

If we load the facts and rules above into an RDFox data store and then run the SPARQL query:

SELECT ?citizen ?passesCheck
WHERE {
    ?citizen a :UKCitizen ;
             :basicInformationCheck ?passesCheck .
}

we will see the following answer:

:tweety xsd:False .

If we then add the following triple to the data store:

:tweety :isVaccinated xsd:True .

and re-run the query, we will instead see:

:tweety xsd:True .
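The behaviour of the two rules can be sketched with a toy Python model. This is an illustration of the logic only, not how RDFox evaluates rules internally; the function name and the tuple representation of triples are my own assumptions. The key point it shows is that the derived facts are recomputed from the explicit facts, so the false value disappears once the vaccination triple is present.

```python
def check_results(explicit):
    """Recompute the derived :basicInformationCheck facts from the explicit facts."""
    derived = set()
    citizens = {s for s, p, o in explicit if p == "a" and o == ":UKCitizen"}
    for citizen in sorted(citizens):
        vaccinated = (citizen, ":isVaccinated", "xsd:True") in explicit
        # NOT [...] succeeds only when the triple is absent from the store,
        # so exactly one of the two rules fires for each citizen.
        value = "xsd:True" if vaccinated else "xsd:False"
        derived.add((citizen, ":basicInformationCheck", value))
    return derived


store = {(":tweety", "a", ":UKCitizen")}
before = check_results(store)
print(before)  # {(':tweety', ':basicInformationCheck', 'xsd:False')}

store.add((":tweety", ":isVaccinated", "xsd:True"))
after = check_results(store)
print(after)   # {(':tweety', ':basicInformationCheck', 'xsd:True')}
```

Because neither value is ever asserted explicitly, there is never a stale explicit triple left behind to show up next to the derived one.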

Other issues

There are a couple of other apparent misunderstandings in your question which I hope it will help to point out.

First, you are using the predicate rdf:type with values of xsd:True and xsd:False. This implies that xsd:True is a class, which is not so. Likewise for xsd:False. I have avoided this in the example given above by using :basicInformationCheck in the predicate position only.

Second, your Datalog rule does not have any variables. This is perfectly valid but highly unusual. It just adds one exact triple if and only if another exact triple is present. The exact same effect could be achieved by ensuring that the base triples contain both of those exact triples or neither of them. Real-world useful Datalog rules will almost always contain variables as shown in the example.

The final (minor) thing to note is that the predicate a is a synonym for rdf:type. Your data uses a mixture of the two. It would be better to use one or the other consistently.
