
My question is about using SPARQL to query an OWL ontology that makes heavy use of owl:Restriction (in my case, the "Cell Ontology").

Here is a typical entry (in Turtle format, extracted from the above-mentioned ontology):

###  http://purl.obolibrary.org/obo/CL_0000792
obo:CL_0000792 rdf:type owl:Class ;
           owl:equivalentClass [ owl:intersectionOf ( obo:CL_0000624
                                                      [ rdf:type owl:Restriction ;
                                                        owl:onProperty obo:RO_0002104 ;
                                                        owl:someValuesFrom obo:PR_000001380
                                                      ]
                                                      [ rdf:type owl:Restriction ;
                                                        owl:onProperty obo:RO_0002215 ;
                                                        owl:someValuesFrom obo:GO_0050777
                                                      ]
                                                      [ rdf:type owl:Restriction ;
                                                        owl:onProperty <http://purl.obolibrary.org/obo/cl#has_low_plasma_membrane_amount> ;
                                                        owl:someValuesFrom obo:PR_000001869
                                                      ]
                                                    ) ;
                                 rdf:type owl:Class
                               ] ;
           rdfs:subClassOf obo:CL_0000624 ,
                           obo:CL_0000815 ,
                           [ rdf:type owl:Restriction ;
                             owl:onProperty obo:RO_0002104 ;
                             owl:someValuesFrom obo:PR_000001380
                           ] ,
                           [ rdf:type owl:Restriction ;
                             owl:onProperty obo:RO_0002215 ;
                             owl:someValuesFrom obo:GO_0050777
                           ] ,
                           [ rdf:type owl:Restriction ;
                             owl:onProperty <http://purl.obolibrary.org/obo/cl#has_low_plasma_membrane_amount> ;
                             owl:someValuesFrom obo:PR_000001869
                           ] .

My ultimate goal is to turn the restrictions held under owl:equivalentClass into rdfs:subClassOf restrictions:

obo:CL_0000792 rdfs:subClassOf [
     rdf:type owl:Restriction ;
     owl:onProperty obo:RO_0002104 ;
     owl:someValuesFrom obo:PR_000001380
 ] ;
 rdfs:subClassOf [
     rdf:type owl:Restriction ;
     owl:onProperty <http://purl.obolibrary.org/obo/cl#has_low_plasma_membrane_amount> ;
     owl:someValuesFrom obo:PR_000001869
 ] .

What I have not managed to do is retrieve all three restrictions from the rdfs:subClassOf part and then bind them properly to subClassOf-style restrictions like the ones above (filtering out obo:RO_0002215 afterwards would be easy).

EDIT: As I have made some progress, here is a new SPARQL query.

EDIT2: Following Damyan Ognyanov's answer, I updated the SPARQL query, which was ignoring the collection inside owl:intersectionOf and which is now also more compact and elegant.

Here is my current SPARQL query:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX tpo: <http://www.definiens.com/ontologies/TissuePhenomicsOntology>
PREFIX cl: <http://purl.obolibrary.org/obo/cl.owl>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX efo: <http://www.ebi.ac.uk/efo/efo.owl>

CONSTRUCT {
    ?cell rdfs:subClassOf [ rdf:type owl:Restriction ;
                            owl:onProperty ?cellProp ;
                            owl:someValuesFrom ?cellPropValue
                            ] .
    ?cellProp ?cellPropProp ?cellPropObj .
    ?cellPropValue ?cellPropValueProp ?cellPropValuePropValue .
    ?cell ?cellProp2 ?cellProp2Obj .
}

FROM NAMED cl:
FROM NAMED tpo:
WHERE {
    # query cl to get our information
    GRAPH cl:
    {
        ?cell (rdfs:subClassOf|(owl:equivalentClass/owl:intersectionOf/rdf:rest*/rdf:first)) ?x .
        ?x owl:onProperty ?cellProp ;
           owl:someValuesFrom ?cellPropValue .

        ?cellProp ?cellPropProp ?cellPropObj .
        ?cellPropValue ?cellPropValueProp ?cellPropValuePropValue .
        ?cell ?cellProp2 ?cellProp2Obj .
    }

    # limit ?cell to the entries already present in TPO
    GRAPH tpo:
    {
        ?cell rdfs:subClassOf* obo:CL_0000000 .
    }
}

If you replace the CONSTRUCT part with SELECT *, all variables appear to be correctly bound, so the information is there.

What I am still missing, though, is a proper CONSTRUCT part to rebuild the "somewhat convoluted" owl:Restriction structure. As it stands, this query mostly returns a long list of blank nodes, which Protege, for instance, will not parse properly.

@AKSW also rightly pointed out that SPARQL may not be the tool of choice for querying and constructing OWL graphs. It is indeed clear here that one needs to know the precise data structure in order to build a working query, at least with this approach.


gpotdevin
  • What you see in Protege as garbage are illegal/incomplete OWL constructs created by your SPARQL query. The RDF parser then indeed fails to parse the OWL construct properly. – UninformedUser Sep 11 '18 at 10:25
  • I don't get your WHERE part. I thought the idea was to "split" the `owl:equivalentClass` axiom? But your WHERE part is looking for `rdfs:subClassOf` – UninformedUser Sep 11 '18 at 10:27
  • Indeed, this was foolish of me: I was tricked by the same information being present twice in the example (and in some other cases in the original data). I just tried replacing the `rdfs:subClassOf` predicate with `owl:equivalentClass`, but this fails, obviously because of the `owl:intersectionOf`. The failure of both the original query and the modified one (though I did not update the question) convinces me that I missed part of the logic of how to query these typical OWL constructs. – gpotdevin Sep 11 '18 at 10:54
  • The problem is that SPARQL is made for RDF, and although there is an RDF mapping of the OWL constructs, those might be represented by an arbitrarily nested set of triples. Thus, without knowing the occurring OWL constructs in advance, it's almost impossible to do this via SPARQL. Why not use an API made for OWL, e.g. the OWL API? It would need ~10 lines of code. – UninformedUser Sep 11 '18 at 12:05
  • With SPARQL, you could try the wildcard pattern `(<p>|!<p>)` for property paths (variables aren't allowed in property paths), but I'm not sure whether this would help in the end. – UninformedUser Sep 11 '18 at 12:08
  • mmmh indeed I see the point. It is indeed part of my original intention to 'duplicate' the semantic information in order to eventually simplify queries in a production environment. I will look into the wildcard pattern approach to solve the original question, as well as into a more dedicated tool / API such as OWL API. – gpotdevin Sep 11 '18 at 12:20
  • "_as the virtuoso reasoner does not handle `owl:equivalentClass` it seems_" Actually, it seems you haven't read [the relevant Virtuoso documentation](http://vos.openlinksw.com/owiki/wiki/VOS/VirtTipsAndTricksGuideRDFSchemaOWLInferenceRules), nor asked a Virtuoso-tagged question here on SO, nor written to [the Virtuoso Users mailing list](https://lists.sourceforge.net/lists/listinfo/virtuoso-users/) about `owl:equivalentClass` reasoning.... After which, it seems that addressing the rest of this question will be moot. – TallTed Sep 12 '18 at 02:45
  • Removed the comment about Virtuoso, though I am not sure this is an important point for this question. Or would you explain how the reasoner might help here? As for the Virtuoso mailing list, sorry, this question is not Virtuoso-specific, and as a matter of fact the Virtuoso reasoner correctly inferred the properties under rdfs:subClassOf but missed those under owl:equivalentClass (maybe the owl:intersectionOf played a role too?) – gpotdevin Sep 12 '18 at 06:59
  • I may have focused too much on your comment about Virtuoso, which suggested to me that you thought that if Virtuoso handled `owl:equivalentClass` (which it does), you wouldn't need to "transfer `owl:equivalentClass` to `rdfs:subClassOf` (`owl:Restriction`) properties", and so your question would be moot. That not being so, the answer from @damyan-ognyanov seems likely to be on the right track. – TallTed Sep 12 '18 at 20:25

1 Answer


The value of owl:intersectionOf is an RDF list, and the Turtle snippet above uses the RDF list syntax to enumerate the members of the owl:intersectionOf collection (i.e., the entries are enclosed between ( and )).

So you should also include the rdf:rest*/rdf:first properties in your property paths, since these are the properties used to build the collection.
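
As a minimal sketch of why that path is needed: a Turtle collection written as ( a b c ) is shorthand for a chain of blank nodes linked by rdf:rest, each holding one member via rdf:first, and terminated by rdf:nil. Assuming the ontology is loaded in the default graph (an assumption, since the question uses named graphs), a query along these lines should list every member of the intersection for the example class:

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX obo: <http://purl.obolibrary.org/obo/>

# rdf:rest* walks zero or more steps along the list cells;
# rdf:first then picks the member stored in the cell reached.
SELECT ?member
WHERE {
    obo:CL_0000792 owl:equivalentClass/owl:intersectionOf/rdf:rest*/rdf:first ?member .
}
```

For the entry shown in the question, the members are the named class obo:CL_0000624 and the three owl:Restriction blank nodes. Without rdf:rest*/rdf:first the path stops at the head cell of the list and never reaches the restrictions.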

For the query, I'll introduce an additional variable to hold the restriction of interest and use it to fetch the values of owl:onProperty and owl:someValuesFrom; e.g., your WHERE clause may look something like:

?cell (rdfs:subClassOf|(owl:equivalentClass/owl:intersectionOf/rdf:rest*/rdf:first)) ?x .
?x owl:onProperty ?cellProp ;
   owl:someValuesFrom ?cellPropValue .

?cellProp ?cellPropProp ?cellPropObj . 
?cellPropValue ?cellPropValueProp ?cellPropValuePropValue .
?cell ?cellProp2 ?cellProp2Obj .
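
Building on the pattern above, here is a hedged sketch of a complete CONSTRUCT query restricted to the owl:equivalentClass branch. It assumes the data sits in the default graph and leaves out the three catch-all patterns; both choices are mine for illustration, not part of the original query. A blank node in a CONSTRUCT template is created afresh for every solution, so the sub-select with DISTINCT keeps one solution, and hence one restriction, per combination of class, property and filler:

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX obo:  <http://purl.obolibrary.org/obo/>

CONSTRUCT {
    # One fresh blank node is generated here per solution of the WHERE part.
    ?cell rdfs:subClassOf [ rdf:type owl:Restriction ;
                            owl:onProperty ?cellProp ;
                            owl:someValuesFrom ?cellPropValue ] .
}
WHERE {
    {
        # DISTINCT keeps one row per (cell, property, filler) combination.
        SELECT DISTINCT ?cell ?cellProp ?cellPropValue
        WHERE {
            ?cell owl:equivalentClass/owl:intersectionOf/rdf:rest*/rdf:first ?x .
            ?x owl:onProperty ?cellProp ;
               owl:someValuesFrom ?cellPropValue .
            # Optional: leave out a property that is not wanted in the output.
            FILTER (?cellProp != obo:RO_0002215)
        }
    }
}
```

Note that a catch-all pattern such as ?cell ?cellProp2 ?cellProp2Obj also copies triples whose object is a blank node (the owl:equivalentClass triple, for instance) without copying that node's own description. This might be one source of undefined blank nodes in the output, though I have not verified it against Protege.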
TallTed
Damyan Ognyanov
  • Indeed, the query in the question did not retrieve the members of `owl:intersectionOf` collections. Thanks for this, I updated the query in the question! I still have some issues building a valid ontology, but I suspect the problem is unrelated (at some point a blank node is created that is not defined anywhere, so Protege stops and returns an error). – gpotdevin Sep 13 '18 at 10:14