
I use the SPARQLWrapper module to send a query to a Virtuoso endpoint and retrieve the result.

The query always returns a maximum of 10000 results.

Here is the python script:

from SPARQLWrapper import SPARQLWrapper, JSON 

queryString = """ 
SELECT DISTINCT ?s
WHERE {
    ?s ?p ?o .
}
"""


sparql = SPARQLWrapper("http://localhost:8890/sparql")
sparql.setQuery(queryString)
sparql.setReturnFormat(JSON)

res = sparql.query().convert()

# Parse result
parsed = []
for entry in res['results']['bindings']:
    for sparql_variable in entry.keys():
        parsed.append({sparql_variable: entry[sparql_variable]['value']})

print('Query return ' + str(len(parsed)) + ' results')

When I launch the query with

SELECT count(*) AS ?count

I get the right number of triples: 917051.

Why does the SPARQLWrapper module limit the number of results to 10000?

How do I get all the results?

John Doe
  • What do you mean by "launch directly"? Usually Virtuoso has a default limit set in the `virtuoso.ini` file - so at first, you should check your configuration. – UninformedUser Feb 24 '17 at 11:26
  • When I launch the query with a count in Virtuoso's Conductor interface, I get the right number. I changed the parameters in the .ini file and it worked, thanks! – John Doe Feb 24 '17 at 13:09

2 Answers


The answer is to adjust the Virtuoso configuration file, as documented. Specifically for this case, you need to increase the value of ResultSetMaxRows in the [SPARQL] stanza of virtuoso.ini.
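As a quick way to confirm which cap is currently configured, here is a minimal Python sketch that reads ResultSetMaxRows from the [SPARQL] stanza. The sample INI text and the helper name `result_set_max_rows` are illustrative assumptions, not part of Virtuoso or SPARQLWrapper; in practice you would pass in the contents of your own virtuoso.ini.

```python
import configparser
from typing import Optional

# Illustrative sample only; a real virtuoso.ini contains many more stanzas and keys.
SAMPLE_INI = """
[SPARQL]
ResultSetMaxRows = 10000
"""


def result_set_max_rows(ini_text: str) -> Optional[int]:
    """Return ResultSetMaxRows from the [SPARQL] stanza, or None if it is absent.

    Hypothetical helper for illustration; it only reads the setting.
    """
    # strict=False tolerates duplicate keys; interpolation=None leaves '%' untouched.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep the original key casing
    parser.read_string(ini_text)
    if not parser.has_option("SPARQL", "ResultSetMaxRows"):
        return None
    return parser.getint("SPARQL", "ResultSetMaxRows")


if __name__ == "__main__":
    print(result_set_max_rows(SAMPLE_INI))
```

To check a real installation, read the file yourself (its location depends on how Virtuoso was installed) and pass the text to the helper.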

The limit is not in SPARQLWrapper. You would see the same limit if you did the full SELECT (instead of the COUNT, which only delivers 1 row) through the SPARQL endpoint, Conductor, or any other interface.
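If you cannot change the server configuration, one common workaround is to page through the results with LIMIT and OFFSET, keeping each page at or below the server cap. The following is only a sketch under assumptions: `fetch_all` and `sparqlwrapper_runner` are hypothetical helper names, the ORDER BY is added to keep pages stable, and the page size of 10000 matches the cap seen in the question. Virtuoso may apply other limits to large sorted offsets, so treat this as a starting point rather than a guaranteed solution.

```python
from typing import Callable, Dict, Iterator, List

Binding = Dict[str, Dict[str, str]]

# Same query as in the question, with ORDER BY added so pages do not overlap.
BASE_QUERY = """
SELECT DISTINCT ?s
WHERE {
    ?s ?p ?o .
}
ORDER BY ?s
"""


def fetch_all(
    run_query: Callable[[str], List[Binding]],
    base_query: str,
    page_size: int = 10000,
) -> Iterator[Binding]:
    """Yield every binding by requesting successive LIMIT/OFFSET pages."""
    offset = 0
    while True:
        page = run_query(f"{base_query}\nLIMIT {page_size} OFFSET {offset}")
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size


def sparqlwrapper_runner(endpoint: str) -> Callable[[str], List[Binding]]:
    """Build a run_query callable backed by SPARQLWrapper (imported lazily)."""
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(endpoint)
    sparql.setReturnFormat(JSON)

    def run_query(query: str) -> List[Binding]:
        sparql.setQuery(query)
        return sparql.query().convert()["results"]["bindings"]

    return run_query
```

With the endpoint from the question, usage would look like `rows = list(fetch_all(sparqlwrapper_runner("http://localhost:8890/sparql"), BASE_QUERY))`. Passing the query runner in as a callable keeps the paging logic independent of the endpoint, so it can be exercised against a fake runner.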

TallTed

The 10000-row limit is set by the data owner through the ResultSetMaxRows item in virtuoso.ini, to protect the data.
Without it, anyone could use a simple SPARQL query such as select * where {?s ?p ?o} to retrieve all the data, which may cost the data owner a lot of time and money.

YJ. Yang