I'm trying to write a SPARQL query that returns the distinct values of each data property in a Turtle file. I would like to know what each value is and how many times it occurs. I have created a simple ontology to test:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix uni: <http://www.example.com/university#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@base <http://www.example.com/university> .

<http://www.example.com/university> rdf:type owl:Ontology .

#################################################################
#    Classes
#################################################################

###  http://www.example.com/university#Lecturer
:Lecturer rdf:type owl:Class ;
      rdfs:subClassOf :Person .


###  http://www.example.com/university#Person
:Person rdf:type owl:Class .


#################################################################
#    Individuals
#################################################################
###  http://www.example.com/university#Lecturer1
:Lecturer1 rdf:type owl:NamedIndividual ,
                    :Lecturer ;
       :first_name "John"^^xsd:string ;
       :last_name "Coles"^^xsd:string ;
       :staffID "234"^^xsd:int .


###  http://www.example.com/university#Lecturer2
:Lecturer2 rdf:type owl:NamedIndividual ,
                    :Lecturer ;
       :first_name "John"^^xsd:string ;
       :last_name "Doe"^^xsd:string ;
       :staffID "89387"^^xsd:int .


###  http://www.example.com/university#lecturer3
:lecturer3 rdf:type owl:NamedIndividual ,
                    :Lecturer ;
       :first_name "John"^^xsd:string ;
       :last_name "Doe"^^xsd:string ;
       :staffID "7658"^^xsd:int .


#################################################################
#    General axioms
#################################################################

[ rdf:type owl:AllDisjointClasses ;
  owl:members (
         :Lecturer
       )
] .

And this is the SPARQL query I'm using:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX uni: <http://www.example.com/university#>
select distinct ?ind ?property ?value (count(?value) as ?noOfDistinctValues) where {
   ?ind rdf:type uni:Lecturer .
   ?ind ?property ?value .
   ?property a owl:DatatypeProperty
}
group by ?ind ?property ?value

And here are the results. The counts do not make sense to me, and I'm sure there is something wrong with my query:

    ind        property     value     noOfDistinctValues
------------------------------------------------------------
  lecturer2    staffID      89387         6
  lecturer2    first_name   John          8
  lecturer2    last_name    Doe           8
  lecturer1    staffID      234           6
  lecturer1    first_name   John          8
  lecturer1    last_name    Coles         8
  lecturer3    staffID      7658          6
  lecturer3    first_name   John          8
  lecturer3    last_name    Doe           8

What I am looking for:

property     value     noOfDistinctValues 
------------------------------------------
staffID      89387         1               
first_name   John          3                
last_name    Doe           2                
staffID      234           1                
last_name    Coles         1                
staffID      7658          1                

I'm not even sure what the returned count represents. I'm also new to ontologies and SPARQL.

I greatly appreciate your help.

DjSh
  • First, remove `?ind` and `?value` from both `group by` and `select`. – Stanislav Kralin Jan 17 '20 at 22:09
  • If I remove the `?value`, the numbers will be even more messed up and I need to see what values are. I could remove ?ind though – DjSh Jan 17 '20 at 22:34
  • Your question is a bit misleading ... do you want the number of occurrences of each value per property, or the total number of distinct values per property? Those are obviously different things. You need subqueries – UninformedUser Jan 18 '20 at 04:26
  • For example, to get the number of occurrences per property per value, you have to count the `?ind` of course: `select ?property ?value (count(?ind) as ?noOfValueOccurrences) where { ?ind rdf:type uni:Lecturer . ?ind ?property ?value . ?property a owl:DatatypeProperty } group by ?property ?value` – UninformedUser Jan 18 '20 at 04:27
  • The number of distinct values per property: `select ?property (count(distinct ?value) as ?noOfTotalValuesForProperty) where { ?ind rdf:type uni:Lecturer . ?ind ?property ?value . ?property a owl:DatatypeProperty } group by ?property` – UninformedUser Jan 18 '20 at 04:29
  • you have to combine both – UninformedUser Jan 18 '20 at 04:31
  • @AKSW thank you very much for your help. Your first query returns numbers that are not right, like 86, 78, ... Your second query returns the number of distinct values per property. I would like to know what the values are and how many times each is repeated. You were right, my question was not clear. I'm editing it now – DjSh Jan 18 '20 at 13:47
  • Sorry, but my first query works as expected; I tried it just now. By the way, your data is invalid Turtle. With `@base` you would need to write `<#Lecturer>` etc. If you use `:Lecturer`, you have to define `@prefix : <...> .` Also, the `uni:` namespace is different in the data and the query. But in any case, my query works as expected. You have to copy and paste it; I tried it with Apache Jena from the CLI – UninformedUser Jan 18 '20 at 15:16

1 Answer


Thanks to @AKSW I was able to solve my problem. This worked:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX uni: <http://www.example.com/university#>
select ?property (str(?value) as ?valueLiteral) (str(count(distinct ?ind)) as ?noOfValueOccurrences)
where {
   ?ind rdf:type uni:Lecturer .
   ?ind ?property ?value .
   ?property a owl:DatatypeProperty .
}
group by ?property ?value
order by ?property
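
To see why grouping by `?property ?value` and counting distinct `?ind` gives the numbers I was after, here is a minimal Python sketch that imitates the grouping in memory. It is not a SPARQL engine, and it assumes the data has already been parsed into (individual, property, value) tuples; the `TRIPLES` list and both function names are made up for illustration, with the values copied from the sample ontology above.

```python
from collections import defaultdict

# (individual, property, value) tuples copied by hand from the sample data.
# Assumption: every property listed here is an owl:DatatypeProperty.
TRIPLES = [
    ("Lecturer1", "first_name", "John"),
    ("Lecturer1", "last_name", "Coles"),
    ("Lecturer1", "staffID", "234"),
    ("Lecturer2", "first_name", "John"),
    ("Lecturer2", "last_name", "Doe"),
    ("Lecturer2", "staffID", "89387"),
    ("lecturer3", "first_name", "John"),
    ("lecturer3", "last_name", "Doe"),
    ("lecturer3", "staffID", "7658"),
]


def count_value_occurrences(triples):
    """Imitate GROUP BY ?property ?value with COUNT(DISTINCT ?ind)."""
    groups = defaultdict(set)
    for ind, prop, value in triples:
        groups[(prop, value)].add(ind)
    return {key: len(inds) for key, inds in sorted(groups.items())}


def count_distinct_values(triples):
    """Imitate GROUP BY ?property with COUNT(DISTINCT ?value)."""
    groups = defaultdict(set)
    for _ind, prop, value in triples:
        groups[prop].add(value)
    return {prop: len(values) for prop, values in sorted(groups.items())}


if __name__ == "__main__":
    # One row per (property, value) pair, as in the "What I am looking for" table.
    for (prop, value), n in count_value_occurrences(TRIPLES).items():
        print(prop, value, n)
```

In this simplified model, keeping `?ind` in the group key would put each triple in its own group, so a per-value count across lecturers can only come from grouping on the property and value alone. The second function corresponds to the other reading of the question raised in the comments: the number of distinct values per property.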
DjSh