I have a small graph in a Jena/Fuseki store that I query using rdflib/SPARQLWrapper via CONSTRUCT to build a smaller graph that contains all the info I need.

The resulting graph is an RDFLib Graph with 56 triples in total. According to the Jena logs, the queries that build the graph take no more than a few milliseconds.

However, when I run further queries against this graph, the first one is very slow. For example, this simple SELECT:

SELECT DISTINCT ?o
WHERE {
    ?f threems:predicate ?o .
}

takes more than 1 second. Subsequent similar queries take a fraction of that. I also tried the rdflib .objects() method, with the same performance.

One second is not much, but given that I need to use this small graph a few times within the span of a request, I'd be very glad to bring it down.

I'm not sure how to optimize this. I don't think the query itself is the problem; the delay may come from some preloading or parsing step, but it's not clear where the bottleneck is. Does anyone have an idea or suggestion?

fatz
  • Similar queries will benefit from caching. You can't optimize a query with a single triple pattern. Your query touches the `pso` index; that's the whole optimization a triple store can do here, and Jena Fuseki (probably backed by Jena TDB) does this by default. – UninformedUser Jan 27 '20 at 18:04
  • How long does that query take in the Fuseki logs? Is it the first query after the server starts? Try without the DISTINCT; it may be hiding a lot of results to be skipped. – AndyS Jan 27 '20 at 23:02
  • In the Fuseki log it takes about 3 ms, and no, it's not the first query after the server starts. – fatz Jan 28 '20 at 15:20

0 Answers