0

The Semantic Web is still a goal; what we have today is the LOD cloud.

Every dataset has its own SPARQL endpoint.

I can query the triples of a single dataset.

How can I query the whole Semantic Web, or the whole LOD cloud?

Wisamx
  • There is a LOD cloud cache hosted by OpenLink: http://lod.openlinksw.com/sparql. But I don't know whether it really contains the whole LOD cloud. The datasets in the LOD cloud change over time, and that surely is not reflected in this triple store. – UninformedUser Nov 17 '17 at 04:47
  • Note that the idea of the Semantic Web is not to have a single centralized database; Linked Data is a concept that allows for traversal via HTTP requests. It's similar to the "normal" Web: there is no single web page, but via HTTP one can crawl data on the Web. In addition, there is the concept of federated querying for the Semantic Web. – UninformedUser Nov 17 '17 at 04:49
  • @AKSW - The datasets in the experimental LOD Cloud Cache *quad* store have indeed changed over time, and will continue to do so, though you may not have noticed such changes. (Named Graphs are put to substantial use in this instance, so it absolutely cannot be considered a mere triple store.) Many datasets in the LOD Cloud pictorial are not available as dumps, and much data cleansing is necessary for those that are, so you are correct that the Cache does not include all of those pictured, and that picture itself shows only a subset of the accessible LOD Cloud. – TallTed Nov 17 '17 at 16:47

5 Answers

6

No, there is no such single SPARQL endpoint, because the Semantic Web is decentralized by design. However, SPARQL 1.1 supports federated queries over different SPARQL endpoints using the SERVICE keyword. See https://www.w3.org/TR/sparql11-federated-query/ for reference. The literature also discusses how to determine which data sources might be relevant for answering a query at Internet scale, for example:

Hartig O., Bizer C., Freytag J.C. (2009) Executing SPARQL queries over the Web of Linked Data. In: Bernstein A. et al. (eds.) The Semantic Web – ISWC 2009. ISWC 2009. Lecture Notes in Computer Science, vol. 5823, pp. 293–309. Heidelberg: Springer. doi: 10.1007/978-3-642-04930-9_19
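To make the SERVICE keyword concrete, here is a minimal sketch in Python that builds a federated query and the HTTP request that would carry it. Everything specific in it is an assumption for illustration: the two endpoint URLs, the choice of owl:sameAs links between DBpedia and Wikidata, and the helper names federated_query and build_request. Whether a given public endpoint accepts SERVICE at all depends on its configuration.

```python
import urllib.parse
import urllib.request

# Assumed endpoints, chosen only for illustration.
LOCAL_ENDPOINT = "https://dbpedia.org/sparql"          # receives the query
REMOTE_ENDPOINT = "https://query.wikidata.org/sparql"  # reached via SERVICE


def federated_query(limit: int = 10) -> str:
    """Return a SPARQL 1.1 query that joins local data with a remote endpoint."""
    return f"""
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?journal ?item ?label WHERE {{
  # Evaluated by the endpoint that receives the query.
  {{
    SELECT ?journal ?item WHERE {{
      ?journal a dbo:AcademicJournal ;
               owl:sameAs ?item .
      FILTER(STRSTARTS(STR(?item), "http://www.wikidata.org/entity/"))
    }}
    LIMIT {limit}
  }}
  # Delegated to the second endpoint.
  SERVICE <{REMOTE_ENDPOINT}> {{
    ?item rdfs:label ?label .
    FILTER(LANG(?label) = "en")
  }}
}}
""".strip()


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run(endpoint: str, query: str, timeout: float = 60.0) -> str:
    """Send the query over the network and return the raw response body."""
    with urllib.request.urlopen(build_request(endpoint, query), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling run(LOCAL_ENDPOINT, federated_query()) would send the query. Note that this still requires knowing in advance which endpoints to name, so it federates over a chosen set of sources and not over the whole LOD cloud.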

Leslie Sikos
  • Just as a comment: doing SPARQL queries over the Web of Linked Data will never work, let alone scale. Doing SPARQL via Linked Data in particular has obvious limitations, and I'm pretty sure that not all the data of the LOD cloud is even hosted correctly according to the Linked Data principles. – UninformedUser Nov 17 '17 at 09:25
5

There is a W3C-owned and (un-?)maintained wiki page listing ~60 SPARQL endpoints. Many of its "last accessed/checked" entries are from 2010. That page links to http://sparqles.ai.wu.ac.at/availability, which lists more endpoints and is much more up to date.

Read the second paragraph, titled "SPARQL Endpoints", of the blog post Querying DBpedia with GraphQL for a skeptical view of the state of SPARQL today. I cannot say it any better myself.

Also note that SPARQL permits every endpoint to offer any number of "named GRAPH" constructs that can be queried at that endpoint. That is one more feature to consider.
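As a rough illustration of the named-graph point, the sketch below builds two queries: one that lists the named graphs an endpoint exposes, and one that restricts a pattern to a single graph. It also reads graph IRIs out of a response in the standard SPARQL 1.1 JSON results format. The helper names and the sample graph IRI are hypothetical, and some endpoints may time out on the unrestricted listing query.

```python
import json
from typing import List

# Lists up to 20 named graphs that contain at least one triple.
LIST_GRAPHS_QUERY = """
SELECT DISTINCT ?g WHERE {
  GRAPH ?g { ?s ?p ?o }
}
LIMIT 20
""".strip()


def query_in_graph(graph_iri: str, pattern: str, limit: int = 10) -> str:
    """Restrict a graph pattern to one named graph (hypothetical helper)."""
    return f"SELECT * WHERE {{ GRAPH <{graph_iri}> {{ {pattern} }} }} LIMIT {limit}"


def graph_iris(results_json: str) -> List[str]:
    """Extract the ?g bindings from a SPARQL 1.1 JSON results document."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return [row["g"]["value"] for row in bindings if "g" in row]
```

The query strings can be sent to an endpoint with any HTTP client; the JSON handling assumes the endpoint was asked for application/sparql-results+json.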

knb
2

There is no central point regarding the notion of a Semantic Web of Linked Data. Instead, like any Super Information Highway, you have major concentration points (hubs or junctions) that enable you to discover routes to a variety of destinations.

Major Semantic Web of Linked Data hubs that we oversee at OpenLink Software include:

  1. DBpedia
  2. DBpedia-Live
  3. URIBurner
  4. LOD Cloud Cache

Remember, the fundamental principle behind Linked Open Data is that hyperlinks (HTTP URIs) function as words in sentences constructed using RDF Language. Thus, you can use the SPARQL Query Language to produce query solutions (tables or graphs) that expose desired routes (e.g., using Property Paths).

Finally, you can also use Federated SPARQL Query (SPARQL-FED) to navigate a Semantic Web of Linked Data.

Examples:

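PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# rdf:type{1,3} follows rdf:type links for one to three hops. This range
# quantifier is accepted by Virtuoso but is not part of the SPARQL 1.1 Recommendation.
select distinct * 
where { 
        ?s a <http://dbpedia.org/ontology/AcademicJournal> ; 
        rdf:type{1,3} ?o 
       } 
LIMIT 50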

Query Solution Document Link.

We are also working on a publicly available Google Spreadsheet that provides additional information related to the kinds of datasets accessible via the LOD Cloud that we maintain.

1

To my knowledge LOD-a-lot is currently the one ongoing effort that gets closest to the vision of querying the whole web of data. And this is obviously done using different means than SPARQL endpoints.

szymon
0

It's still a prototype, which means bugs, but one of the aims of wimuQ is to provide a way to query all 539 public SPARQL endpoints plus all datasets from LODLaundromat and LODStats. That is more than 600,000 datasets and more than 5 terabytes. As far as I know, it is the most extensive collection of datasets accessible from one single place.

For more information, the paper is available here: