I'm working on a small task with SPARQL queries. I want to get the number of entities and the number of instances. I have basic knowledge of SPARQL and RDF. I wrote a SPARQL query to get the number of entities, but I'm not 100% sure it's right. The endpoint I'm using is DBpedia. Here's the query.

#Number of Entities

SELECT (COUNT(?entity) AS ?Entities)
WHERE {
  ?entity rdf:type ?type .
}
-----------
Output:
113715893

The output above is a very large number. Is this the right query to get the number of entities?

I also have to get the number of instances, but I'm not sure what 'instances' means. I assume it refers to subclasses or something similar. Can anyone help me out with the task?

Aziz Mumtaz

1 Answer

Hey, the problem with the terms entity and instance is that they are often used with different meanings. I assume entity means every URI that can be a subject, while instance means every entity that is an instance of an owl:Class.

For the entities the query would be:

SELECT (COUNT(DISTINCT ?entity) AS ?Entities)
WHERE { ?entity ?p ?o }

For instances I would write the following query:

SELECT (COUNT(DISTINCT ?instance) AS ?Instances)
WHERE { ?instance a ?class . ?class a owl:Class }
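To make the difference between the two counts concrete, here is a small Python sketch over hand-made triples. It is not DBpedia data and not how the endpoint evaluates the queries; all the ex: names are made up, and it only mirrors the two definitions I assumed above.

```python
# Toy triples (subject, predicate, object); every ex: name is hypothetical.
RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"

triples = [
    ("ex:Person", RDF_TYPE, OWL_CLASS),
    ("ex:City", RDF_TYPE, OWL_CLASS),
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:livesIn", "ex:berlin"),
    ("ex:berlin", RDF_TYPE, "ex:City"),
    ("ex:bob", "ex:knows", "ex:alice"),  # bob has no rdf:type at all
]

# Entities: every distinct URI in subject position (?entity ?p ?o).
entities = {s for s, _, _ in triples}

# Classes: everything declared as an owl:Class (?class a owl:Class).
classes = {s for s, p, o in triples if p == RDF_TYPE and o == OWL_CLASS}

# Instances: subjects typed with one of those classes (?instance a ?class).
instances = {s for s, p, o in triples if p == RDF_TYPE and o in classes}

print(len(entities), len(instances))
```

In this toy data ex:bob counts as an entity but not as an instance, because he never appears with a type, and the two classes themselves are entities but not instances, because owl:Class is not itself declared as a class here.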

Did you notice the distinct before the variable I count? It matters a lot here. In your attempt, an entity can have multiple types, and for each of these types you get a separate binding for the combination of the entity and type variables. As a result, the entity is counted once for every type the query finds, so an entity with two types is counted twice. I assume you want to count each entity only once, so you need the distinct keyword on the variable you count. This ensures that only different entities bound to that variable are counted.
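The double counting can be sketched in a few lines of Python. The rows below are hypothetical bindings for ?entity rdf:type ?type, not real DBpedia results, and the two counts only imitate what count(?entity) and count(distinct ?entity) would do with them.

```python
# Hypothetical bindings (entity, type) for the pattern: ?entity rdf:type ?type
rows = [
    ("ex:alice", "ex:Person"),
    ("ex:alice", "ex:Agent"),   # same entity, second type -> second row
    ("ex:berlin", "ex:City"),
]

# Like count(?entity): one count per binding row.
count_all = len([entity for entity, _ in rows])

# Like count(distinct ?entity): each entity is counted once.
count_distinct = len({entity for entity, _ in rows})

print(count_all, count_distinct)
```

Here ex:alice has two types, so the plain count is 3 while the distinct count is 2.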

Bierbarbar
  • @Bierbarbar, I ran your first query for the number of entities, and the output shows nothing, although the second query (instances) gave me a result. – Aziz Mumtaz Jul 13 '18 at 11:13
  • Just tried it out. The problem is that the public DBpedia endpoint is limited (in the number of returned results and in computation time). So when you call count without distinct, you only get the number of bindings counted until the computation limit was reached. With distinct, the distinct operation itself already exceeds the time limit, so you get no result at all. – Bierbarbar Jul 13 '18 at 12:21
  • I see, thanks for letting me know. I appreciate your help. – Aziz Mumtaz Jul 13 '18 at 12:42
  • Just one more thing: is there a way to get the list of datasets in the DBpedia endpoint (not the count)? – Aziz Mumtaz Jul 13 '18 at 12:45