Let's say I have the following:

@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .

and I run the following SPARQL query:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sch:  <http://schema.org/>

SELECT DISTINCT ?s
WHERE {
    {
        ?s ?p ?o .
        FILTER NOT EXISTS {
            ?s a ?c .
            FILTER(?c IN (rdfs:Class, rdf:Property))
        }
    }
}

The results returned will be:

----------------------------------------------------------------
| s                                                            |
================================================================
| <http://learningsparql.com/ns/humanResources#Another>        |
| <http://learningsparql.com/ns/humanResources#BadOne>         |
| <http://learningsparql.com/ns/humanResources#BadTwo>         |
| <http://learningsparql.com/ns/humanResources#YetAnother>     |
| <http://learningsparql.com/ns/humanResources#BadThree>       |
| <http://learningsparql.com/ns/humanResources#AnotherName>    |
| <http://learningsparql.com/ns/humanResources#YetAnotherName> |
----------------------------------------------------------------

The only three results I want returned are:

----------------------------------------------------------------
| s                                                            |
================================================================
| <http://learningsparql.com/ns/humanResources#BadOne>         |
| <http://learningsparql.com/ns/humanResources#BadTwo>         |
| <http://learningsparql.com/ns/humanResources#BadThree>       | 
----------------------------------------------------------------

What makes them bad? If I follow the rdf:type information, their type chain does not terminate in an rdfs:Class or rdf:Property.

Compare hr:YetAnother: it has an rdf:type of hr:Another, hr:Another has an rdf:type of hr:Employee, and hr:Employee is an rdfs:Class. So the type chains from hr:YetAnother and hr:Another terminate in rdfs:Class, and they should not be returned by the query.

In my example the type chains are short, but in general a chain could have more links.

Is it possible to write such a query with SPARQL? If so, what would that query be?

James Hudson

1 Answer

The SPARQL feature required to solve this problem is called Property Paths.

The following query:

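PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT DISTINCT ?s
WHERE {
    {
        ?s ?p ?o .
        FILTER NOT EXISTS {
            ?s rdf:type* ?c .
            FILTER(?c IN (rdfs:Class, rdf:Property) && ?s NOT IN (rdfs:Class, rdf:Property))
        }
    }
}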

will return the expected results:

----------------------------------------------------------
| s                                                      |
==========================================================
| <http://learningsparql.com/ns/humanResources#BadOne>   |
| <http://learningsparql.com/ns/humanResources#BadTwo>   |
| <http://learningsparql.com/ns/humanResources#BadThree> |
----------------------------------------------------------
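To make the reasoning concrete, here is a minimal plain-Python sketch of what I understand the rdf:type* path to be doing: it follows zero or more rdf:type links from each subject and checks whether rdfs:Class or rdf:Property is reached. The tuple representation and the names reachable_by_type and bad_subjects are my own for illustration, not part of any SPARQL engine. The sketch leaves out the ?s NOT IN (...) guard, which makes no difference here because neither term appears as a subject in the sample data.

```python
RDF_TYPE = "rdf:type"
TERMINALS = {"rdfs:Class", "rdf:Property"}

# The triples from the question, written as (subject, predicate, object) tuples.
TRIPLES = [
    ("hr:Employee", RDF_TYPE, "rdfs:Class"),
    ("hr:BadThree", "rdfs:comment", "some comment about missing"),
    ("hr:BadTwo", RDF_TYPE, "hr:BadOne"),
    ("hr:YetAnother", RDF_TYPE, "hr:Another"),
    ("hr:YetAnotherName", RDF_TYPE, "hr:AnotherName"),
    ("hr:Another", RDF_TYPE, "hr:Employee"),
    ("hr:AnotherName", RDF_TYPE, "hr:name"),
    ("hr:BadOne", RDF_TYPE, "hr:Dangling"),
    ("hr:name", RDF_TYPE, "rdf:Property"),
]


def reachable_by_type(start, triples):
    """Return every node reachable from start over zero or more rdf:type links."""
    seen = {start}  # zero steps: the start node itself matches rdf:type*
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for s, p, o in triples:
            if s == node and p == RDF_TYPE and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen


def bad_subjects(triples):
    """Subjects whose type chain never reaches rdfs:Class or rdf:Property."""
    subjects = {s for s, _, _ in triples}
    return sorted(
        s for s in subjects
        if not (reachable_by_type(s, triples) & TERMINALS)
    )


print(bad_subjects(TRIPLES))
```

Because the walk keeps track of the nodes it has already seen, it also stops on a cyclic type chain, and the length of the chain does not matter.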

To better understand what is going on, break the query down. First consider:

(A)

SELECT DISTINCT *
WHERE {
    {
        ?s ?p ?o .       
    }
}

which will return the following results:

-------------------------------------------------------------------------------------------------------------------------------------------
| s                                                            | p            | o                                                         |
===========================================================================================================================================
| <http://learningsparql.com/ns/humanResources#Another>        | rdf:type     | <http://learningsparql.com/ns/humanResources#Employee>    |
| <http://learningsparql.com/ns/humanResources#BadOne>         | rdf:type     | <http://learningsparql.com/ns/humanResources#Dangling>    |
| <http://learningsparql.com/ns/humanResources#BadTwo>         | rdf:type     | <http://learningsparql.com/ns/humanResources#BadOne>      |
| <http://learningsparql.com/ns/humanResources#Employee>       | rdf:type     | rdfs:Class                                                |
| <http://learningsparql.com/ns/humanResources#YetAnother>     | rdf:type     | <http://learningsparql.com/ns/humanResources#Another>     |
| <http://learningsparql.com/ns/humanResources#BadThree>       | rdfs:comment | "some comment about missing"                              |
| <http://learningsparql.com/ns/humanResources#AnotherName>    | rdf:type     | <http://learningsparql.com/ns/humanResources#name>        |
| <http://learningsparql.com/ns/humanResources#name>           | rdf:type     | rdf:Property                                              |
| <http://learningsparql.com/ns/humanResources#YetAnotherName> | rdf:type     | <http://learningsparql.com/ns/humanResources#AnotherName> |
-------------------------------------------------------------------------------------------------------------------------------------------

then, consider the following query:

(B)

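PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT DISTINCT ?s
WHERE {
    {
        ?s rdf:type* ?c .
        FILTER(?c IN (rdfs:Class, rdf:Property) && ?s NOT IN (rdfs:Class, rdf:Property))
    }
}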

which returns the results:

----------------------------------------------------------------
| s                                                            |
================================================================
| <http://learningsparql.com/ns/humanResources#Employee>       |
| <http://learningsparql.com/ns/humanResources#Another>        |
| <http://learningsparql.com/ns/humanResources#YetAnother>     |
| <http://learningsparql.com/ns/humanResources#name>           |
| <http://learningsparql.com/ns/humanResources#AnotherName>    |
| <http://learningsparql.com/ns/humanResources#YetAnotherName> |
----------------------------------------------------------------

Placing (B) inside FILTER NOT EXISTS removes the subjects found by (B) from those found by (A), leaving only the desired results.
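For this data, the net effect on ?s can be pictured as a set difference. The following rough Python sketch copies the subjects from the two result tables above; the variable names are mine. Strictly speaking, FILTER NOT EXISTS tests each solution of the outer pattern rather than subtracting sets, so this only illustrates the outcome.

```python
# Distinct subjects matched by (A), copied from its result table.
subjects_a = {
    "hr:Another", "hr:BadOne", "hr:BadTwo", "hr:Employee", "hr:YetAnother",
    "hr:BadThree", "hr:AnotherName", "hr:name", "hr:YetAnotherName",
}

# Subjects returned by (B): those whose type chain reaches rdfs:Class or rdf:Property.
subjects_b = {
    "hr:Employee", "hr:Another", "hr:YetAnother",
    "hr:name", "hr:AnotherName", "hr:YetAnotherName",
}

# Whatever is in (A) but not in (B) is what the combined query returns.
remaining = sorted(subjects_a - subjects_b)
print(remaining)  # ['hr:BadOne', 'hr:BadThree', 'hr:BadTwo']
```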

James Hudson