I'm creating a local cache of some data from Wikidata so that autocompletion for tags is fast. I would like to refresh the data once a week, but only if the SERVICE call succeeds.

Basically I do:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>

INSERT { GRAPH <http://my.data/graph/wikidata> {
    ?concept wdt:P902 ?hls ;
        rdfs:label ?label ;
        schema:description ?description .
}} WHERE {
    SERVICE <https://query.wikidata.org/sparql> {
        ?concept wdt:P902 ?hls .
        ?concept rdfs:label ?label .
        ?concept schema:description ?description .
        FILTER (lang(?description) = "en")
        FILTER (lang(?label) = "en" || lang(?label) = "de" || lang(?label) = "fr" || lang(?label) = "it")
    }
}

I then thought I could use a single DELETE/INSERT that deletes all existing data first:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>

WITH <http://my.data/graph/wikidata>
DELETE { ?s ?p ?o }
INSERT {
    ?concept wdt:P902 ?hls ;
        rdfs:label ?label ;
        schema:description ?description .
} WHERE {
    SERVICE <https://query.wikidata.org/sparql> {
        ?concept wdt:P902 ?hls .
        ?concept rdfs:label ?label .
        ?concept schema:description ?description .
        FILTER (lang(?description) = "en")
        FILTER (lang(?label) = "en" || lang(?label) = "de" || lang(?label) = "fr" || lang(?label) = "it")
    }
    ?s ?p ?o
}

But with this query I seem to get a 400 back from Wikidata. I run similar DELETE/INSERT updates without SERVICE, and I can't see why this one would fail, as no variables are shared between the pattern on the existing graph and the SERVICE pattern.

Does anyone see what is going wrong here? Basically, I would like to wipe the existing graph but not end up with an empty graph if Wikidata is down.

Stanislav Kralin
Adrian Gschwend
  • AFAIK, you can execute multiple SPARQL 1.1 Update statements in one batch, separated by `;`. Given that, I would suggest either (i) deleting the graph contents via `DELETE WHERE {?s ?p ?o}` or, maybe more efficiently, (ii) using a [dedicated statement](https://www.w3.org/TR/sparql11-query/#rClear) like `CLEAR ` – UninformedUser Mar 04 '18 at 14:10
  • Interesting idea, I could not find that in the spec, any pointers? Also is that still atomic that way? – Adrian Gschwend Mar 04 '18 at 15:42
  • Reference is in the [grammar](https://www.w3.org/TR/sparql11-query/#rUpdate) and informally in the first paragraph [here](https://www.w3.org/TR/sparql11-update/#updateLanguage) – UninformedUser Mar 04 '18 at 16:52
  • citing the relevant part: *"A request is a sequence of operations and is terminated by EOF (End of File). Multiple operations are separated by a ';' (semicolon) character. A semicolon after the last operation in a request is optional. Implementations must ensure that the operations of a single request are executed in a fashion that guarantees the same effects as executing them sequentially in the order they appear in the request."* – UninformedUser Mar 04 '18 at 16:55
  • Regarding atomicity, I don't know - but the second paragraph says: *"If multiple operations are present in a single request, then a result of failure from any operation must abort the sequence of operations, causing the subsequent operations to be ignored."* – UninformedUser Mar 04 '18 at 16:55
  • Thanks, but I think that would not help in my case: the CLEAR is likely to succeed, so I would still end up with an empty graph if Wikidata is down. – Adrian Gschwend Mar 04 '18 at 19:05
  • Ok, I get it. Can you elaborate on when you get the 400 error? What happens if you do `WHERE {{?s ?p ?o } UNION { SERVICE ... }}`? – UninformedUser Mar 05 '18 at 04:08
  • I wish you had posted that as an answer, then I could have accepted it :) Thanks, that works! I will add the working query once I can answer it myself. – Adrian Gschwend Mar 05 '18 at 09:39
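
The batch semantics quoted in the comments above (a failed operation aborts the operations after it) suggest another possible arrangement: fetch into a staging graph first and replace the live graph only afterwards. The following is a minimal sketch under assumptions, not a tested recipe. The staging graph name `<http://my.data/graph/wikidata-staging>` is hypothetical, and the sketch assumes the store reports a failing SERVICE call as a failed operation, which should be verified for the store in use.

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>

# 1. Start from an empty staging graph (hypothetical graph name).
DROP SILENT GRAPH <http://my.data/graph/wikidata-staging> ;

# 2. Fetch from Wikidata into the staging graph. If the SERVICE call
#    fails, this operation fails and the remaining operations are skipped.
INSERT { GRAPH <http://my.data/graph/wikidata-staging> {
    ?concept wdt:P902 ?hls ;
        rdfs:label ?label ;
        schema:description ?description .
}} WHERE {
    SERVICE <https://query.wikidata.org/sparql> {
        ?concept wdt:P902 ?hls .
        ?concept rdfs:label ?label .
        ?concept schema:description ?description .
        FILTER (lang(?description) = "en")
        FILTER (lang(?label) = "en" || lang(?label) = "de" || lang(?label) = "fr" || lang(?label) = "it")
    }
} ;

# 3. Only reached if step 2 succeeded: replace the live graph
#    with the staging graph (MOVE also removes the staging graph).
MOVE GRAPH <http://my.data/graph/wikidata-staging> TO GRAPH <http://my.data/graph/wikidata>
```

One caveat: if Wikidata answers successfully but with no results, the staging graph stays empty, and how MOVE treats a missing or empty source graph can differ between stores, so that case is worth testing before relying on this.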

1 Answer

Thanks to user AKSW for the hint about UNION; this is the query that works:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>

DELETE { GRAPH <http://data.alod.ch/graph/wikidata> { ?s ?p ?o } }
INSERT { GRAPH <http://data.alod.ch/graph/wikidata> {
    ?concept wdt:P902 ?hls ;
        rdfs:label ?label ;
        schema:description ?description .
}} WHERE {
    {
        # Branch 1: fresh data from Wikidata; these rows only feed the INSERT template.
        SERVICE <https://query.wikidata.org/sparql> {
            ?concept wdt:P902 ?hls .
            ?concept rdfs:label ?label .
            ?concept schema:description ?description .
            FILTER (lang(?description) = "en")
            FILTER (lang(?label) = "en" || lang(?label) = "de" || lang(?label) = "fr" || lang(?label) = "it")
        }
    }
    UNION
    {
        # Branch 2: existing local triples; these rows only feed the DELETE template.
        GRAPH <http://data.alod.ch/graph/wikidata> {
            ?s ?p ?o
        }
    }
}
Adrian Gschwend
  • Perhaps you do not need to drop all 16 000 entities and load them again, but only the recently updated ones. Something like this: `?concept wdt:P902 ?hls. ?concept schema:dateModified ?date. FILTER (?date > "2018-02-28T00:00:00Z"^^xsd:dateTime)`. – Stanislav Kralin Mar 05 '18 at 14:04
  • Cool, thanks, I did not know that – Adrian Gschwend Mar 05 '18 at 20:09
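
The incremental approach from the comment above might be sketched as follows. This is only an illustration under assumptions: the cutoff date is the example value from the comment and would have to be set to the time of the last successful run, and the query is insert-only, so labels or descriptions that were changed or removed on Wikidata would remain in the local graph until the next full refresh.

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Insert-only refresh: adds the current triples of recently modified concepts.
INSERT { GRAPH <http://data.alod.ch/graph/wikidata> {
    ?concept wdt:P902 ?hls ;
        rdfs:label ?label ;
        schema:description ?description .
}} WHERE {
    SERVICE <https://query.wikidata.org/sparql> {
        ?concept wdt:P902 ?hls .
        ?concept schema:dateModified ?date .
        ?concept rdfs:label ?label .
        ?concept schema:description ?description .
        # Example cutoff taken from the comment; replace with the last run time.
        FILTER (?date > "2018-02-28T00:00:00Z"^^xsd:dateTime)
        FILTER (lang(?description) = "en")
        FILTER (lang(?label) = "en" || lang(?label) = "de" || lang(?label) = "fr" || lang(?label) = "it")
    }
}
```

Because nothing is deleted here, a failed SERVICE call leaves the local graph as it was, which fits the original requirement of never ending up with an empty graph.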