You will not be able to get a 100%-guaranteed complete result for this without reasoning. For example, rdfs:Resource is quite often not explicitly used in a dataset at all, so without reasoning that class simply does not occur in your data. But you can come close enough to completeness for most practical purposes.
When is some resource known to be a class? There are a number of possibilities:
- when it is of type rdfs:Class;
- when it is the object of an rdf:type relation;
- when it is either the subject or the object of an rdfs:subClassOf relation;
- when it is used as the object of an rdfs:domain or rdfs:range relation.
So, how to query this? Well, the first one is easy:
SELECT ?c WHERE { ?c a rdfs:Class }
As an aside: in SPARQL, the keyword a is shorthand for the rdf:type relation. Note that you do not need to query it transitively (so no * is needed).
The second one is also easy:
SELECT ?c WHERE { [] rdf:type ?c }
The [] bit denotes an anonymous blank node, which in a query acts as an unnamed variable (we are not interested in the subject value; we only want the object at this point).
The third one is a little trickier. We can split it up into two parts: first query all resources that are the subject of an rdfs:subClassOf relation:
SELECT ?c WHERE { ?c rdfs:subClassOf [] }
Then query all resources that are the object of the relation:
SELECT ?c WHERE { [] rdfs:subClassOf ?c }
Now, let's see if we can combine these two patterns in a single query. We can't simply put the two graph patterns in the same WHERE clause, since that would make it a logical AND (meaning we'd only get back those resources that are both subject and object of a subclass relation). So we need a logical OR. In SPARQL this can be done with a UNION:
SELECT ?c
WHERE {
  { ?c rdfs:subClassOf [] }
  UNION
  { [] rdfs:subClassOf ?c }
}
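For contrast, here is a sketch of the conjunctive form that the explanation above warns against. With both patterns in a single group, ?c has to satisfy both of them, so classes that sit only at the top or only at the bottom of the hierarchy drop out. The PREFIX line assumes the standard RDFS namespace.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Logical AND: ?c must have a superclass AND a subclass,
# so this returns only the "middle" classes of the hierarchy.
SELECT DISTINCT ?c
WHERE {
  ?c rdfs:subClassOf [] .
  [] rdfs:subClassOf ?c .
}
```

The UNION form avoids this by accepting a match on either pattern.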
However, in this case we can also express the logical OR in a different way, using a property path expression, like so:
SELECT ?c
WHERE { ?c rdfs:subClassOf|^rdfs:subClassOf [] }
The path expression p1|p2 means "match either of these two relations". The operator ^ says "query the inverse of the relation". Putting this together, the above path expression says "match any triples where the subject is ?c and the relation is rdfs:subClassOf, or where the relation rdfs:subClassOf is inverted and thus the object matches ?c".
We can query the fourth possibility in a similar fashion:
SELECT ?c
WHERE { [] rdfs:domain|rdfs:range ?c }
Putting it all together:
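PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# DISTINCT drops the duplicate rows a class produces when it matches several triples
SELECT DISTINCT ?c
WHERE {
  { ?c a rdfs:Class }
  UNION
  { [] rdf:type ?c }
  UNION
  { [] rdfs:domain|rdfs:range ?c }
  UNION
  { ?c rdfs:subClassOf|^rdfs:subClassOf [] }
}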
This can probably be shortened further. And of course you can simplify it if you know your ontology well (for example, if you know you never use domain or range properties, you can leave out that bit).
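One possible shortening, as a sketch rather than a tested recipe: every pattern except the explicit rdfs:Class check only cares about the position of ?c in the triple, so those patterns can be folded into a single property path by inverting the relations where ?c is the object. The PREFIX lines assume the standard W3C namespaces, and DISTINCT removes the duplicate rows that arise when a class matches several triples.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?c
WHERE {
  # explicitly declared classes
  { ?c a rdfs:Class }
  UNION
  # ?c as subject of subClassOf, or as object of subClassOf, type, domain or range
  { ?c rdfs:subClassOf|^rdfs:subClassOf|^rdf:type|^rdfs:domain|^rdfs:range [] }
}
```

On a SPARQL 1.1 engine this should return the same set of classes as the longer UNION query, since ^p simply swaps the subject and object of p.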