I tried looking through the SPARQL documentation but couldn't find anything that helps. This is what I need to do. (PathL is the path length.)
-
SPARQL is the wrong query language for path queries. There are some workarounds, but honestly, they will never be efficient enough. You'll find those approaches if you search on StackOverflow. Besides standard SPARQL, Blazegraph (the backend of the public Wikidata endpoint) has a graph API which you can try: https://github.com/blazegraph/database/wiki/RDF_GAS_API – UninformedUser Feb 25 '21 at 18:27
-
The way I approached this task in the past was to write a SELECT query to export the data and then load it with a network framework (e.g. NetworkX in Python) or a graph visualisation tool (e.g. Cytoscape). The path computation is then performed by the tool or framework. – lps Feb 28 '21 at 11:06
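A minimal sketch of that export-then-traverse idea, assuming the SELECT results were saved as a two-column CSV of edges. The column names `source` and `target`, the node identifiers, and the helper names `load_graph` and `shortest_path` are all hypothetical. To stay dependency-free it uses a plain breadth-first search from the standard library rather than NetworkX; `networkx.shortest_path` would play the same role. This is only practical for a small exported subgraph, not for the full dataset.

```python
import csv
import io
from collections import defaultdict, deque

# Hypothetical export of a `SELECT ?source ?target` query, saved as CSV.
# The identifiers are placeholders, not real Wikidata entities.
EXPORTED_CSV = """source,target
A,B
B,C
C,D
A,E
E,D
D,F
"""


def load_graph(csv_text):
    """Build an undirected adjacency map from source,target rows."""
    graph = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        graph[row["source"]].add(row["target"])
        graph[row["target"]].add(row["source"])
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; return one shortest path as a node list, or None."""
    if start == goal:
        return [start]
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # sorted() only makes the traversal order deterministic
        for neighbour in sorted(graph.get(node, ())):
            if neighbour in parents:
                continue
            parents[neighbour] = node
            if neighbour == goal:
                # Walk the parent links back to the start node.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


if __name__ == "__main__":
    graph = load_graph(EXPORTED_CSV)
    path = shortest_path(graph, "A", "F")
    print(path, "path length:", len(path) - 1)
```

The path length (PathL in the question) is the number of edges, i.e. `len(path) - 1`. Edges are treated as undirected here; for directed paths, drop the second `add` call in `load_graph`.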
-
@lps I don't see how this would help here. Finding paths between any two Wikidata entities means processing the whole dataset. There is no benefit in performing a SELECT query, as i) no public endpoint will return all triples via a `SELECT` query and ii) the Wikidata dump is free to download. – UninformedUser Feb 28 '21 at 15:23
-
Of course! I missed the mention of Wikidata. Working with such a dataset outside a triple store is a challenge. What you may try in this case is using the [Gremlin language](https://tinkerpop.apache.org/gremlin.html) to specify the graph traversal rules. Some triple stores support both SPARQL and Gremlin (see StardogDB, for example). A less optimal option, if you still want to give Gremlin a go, is to transform the RDF into a TinkerPop graph using a simple library such as [this one](https://pypi.org/project/rdf2gremlin/). – lps Mar 01 '21 at 16:12