Paginating SPARQL query results from AWS Neptune with boto3 or aws-cli

Question

I would like to paginate the results of a SPARQL query to AWS Neptune. So far I've been using the HTTP API to query AWS Neptune, for example:

curl -X POST --data-binary 'query=select ?s ?p ?o where {?s ?p ?o} limit 10' https://your-neptune-endpoint:port/sparql

I haven't found an option to paginate results from the HTTP API, but the aws-cli and the Python boto3 library mention marker and maxrecords parameters to enable pagination. However, I do not understand how to query AWS Neptune via aws-cli or boto3, and there are almost no examples in the documentation.

How do I connect to the cluster?
How do I authenticate?
How do I issue a query as simple as select ?s ?p ?o where {?s ?p ?o}?
How do I paginate the results?

Any help is highly appreciated.

score 1 · Answer 1 · answered Jul 19 '22 at 00:08

You should be able to continue using the SPARQL Protocol request (HTTP API). You're already limiting the response to 10 results. You can add an OFFSET 10 clause after LIMIT 10 to get the second page of 10. Then OFFSET 20, and so on.

Be aware that while this may work consistently for you, SPARQL does not guarantee that the results will be in a stable order without an explicit ORDER BY clause. Adding ORDER BY is likely to make the query more expensive and take longer to execute, but it will ensure the results you get are consistent (except for any data that is updated in between your API calls to fetch each page of results).
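
A minimal sketch of the approach above in Python, assuming the endpoint accepts unsigned SPARQL Protocol requests like the curl command in the question (with IAM authentication enabled, the request would also need SigV4 signing, which is not shown here). The helper names page_query, run_sparql and paginate are illustrative only and are not part of any Neptune or boto3 API; the endpoint URL is the placeholder from the question.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, List


def page_query(base_query: str, page_size: int, page: int) -> str:
    """Append LIMIT/OFFSET for a zero-based page number.

    base_query should already end with an ORDER BY clause if a stable
    order across pages is required.
    """
    return f"{base_query} LIMIT {page_size} OFFSET {page * page_size}"


def run_sparql(endpoint: str, query: str) -> List[dict]:
    """POST a query to a SPARQL endpoint and return the result bindings.

    Assumption: the endpoint needs no request signing and returns the
    standard SPARQL JSON results format.
    """
    data = urllib.parse.urlencode({"query": query}).encode()
    request = urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


def paginate(
    base_query: str,
    page_size: int,
    run: Callable[[str], List[dict]],
) -> Iterator[List[dict]]:
    """Yield pages of results until a short or empty page is returned."""
    page = 0
    while True:
        rows = run(page_query(base_query, page_size, page))
        if not rows:
            return
        yield rows
        if len(rows) < page_size:
            return
        page += 1


# Example usage (placeholder endpoint, as in the question):
# query = "SELECT ?s ?p ?o WHERE {?s ?p ?o} ORDER BY ?s ?p ?o"
# endpoint = "https://your-neptune-endpoint:port/sparql"
# for rows in paginate(query, 10, lambda q: run_sparql(endpoint, q)):
#     print(len(rows))
```

Passing the query runner in as a callable keeps the paging loop independent of how the request is sent, so the same loop would work with a signed request or a different HTTP client.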

Thanks for mentioning OFFSET. I forgot to mention in my question that I'm looking for a solution without the usage of OFFSET, LIMIT and ORDER BY, exactly because of the performance. Do you know if Neptune's API has any feature for pagination? — npobedina, Jul 19 '22 at 15:00
