Please bear with me as I am new to semantic technologies.

I am trying to use the rdflib package to extract labels from classes in ontologies. However, some ontologies don't contain the labels themselves but only refer to classes from other ontologies by their URIs. How does one extract the labels for those URIs from the external ontologies?

The intuition behind my attempts centers on identifying classes that don't have labels locally (if that is the right way of putting it) and then "following" their URIs to the external ontologies to extract the labels. However, the way I have implemented it does not work.

import rdflib

g = rdflib.Graph()

# I have no trouble extracting labels from this ontology:
# g.load("http://purl.obolibrary.org/obo/po.owl#")
# However, this ontology contains no labels locally:
g.load("http://www.bioassayontology.org/bao/bao_complete.owl#")


owlClass = rdflib.namespace.OWL.Class
rdfType = rdflib.namespace.RDF.type

for s in g.subjects(predicate=rdfType, object=owlClass):
    # Where label is present...
    if g.label(s) != '':
        # Do something with label...
        print(g.label(s))

    # This is what I have added to try to follow the URI to the external ontology.
    elif g.label(s) == '':
        g2 = rdflib.Graph()
        g2.parse(location=s)
        # Do something with label...
        print(g.label(s))


Am I taking completely the wrong approach? All help is appreciated! Thank you.

Lorcán

1 Answer

I think you can be much more efficient than this. Your script makes a web request, downloads a remote ontology, and searches it every time it encounters a URI that has no label in http://www.bioassayontology.org/bao/bao_complete.owl. That is most of them, and there are a very large number, so your script will take forever and thrash the web servers delivering those remote ontologies.

Looking at http://www.bioassayontology.org/bao/bao_complete.owl, I see that most of the URIs without labels there are from OBO, and perhaps a couple of other ontologies, but mostly OBO.
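If you would rather check this than eyeball the file, one option is to count the unlabeled URIs by namespace. The following is a minimal sketch using only the standard library; `namespace_of` and `count_namespaces` are hypothetical helper names, the sample URIs are illustrative only, and the "cut at the last # or /" rule is an assumption about how the URIs are shaped.

```python
from collections import Counter


def namespace_of(uri):
    """Return the URI up to and including its last '#' or '/'."""
    uri = str(uri)  # also accepts rdflib URIRef objects
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1] if cut >= 0 else uri


def count_namespaces(uris):
    """Count how many of the given URIs fall under each namespace."""
    return Counter(namespace_of(u) for u in uris)


# Illustrative URIs only; in practice, pass the subjects that had no label.
unlabeled = [
    "http://purl.obolibrary.org/obo/GO_0008150",
    "http://purl.obolibrary.org/obo/CHEBI_24431",
    "http://www.bioassayontology.org/bao#BAO_0000015",
]
counts = count_namespaces(unlabeled)
for namespace, n in counts.most_common():
    print(n, namespace)
```

The namespaces at the top of that list are the ontologies worth downloading; anything that appears only once or twice may not be worth the extra file.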

What you should do is download OBO once and load that with rdflib. Then, if you run your script above on the joined (union) graph of http://www.bioassayontology.org/bao/bao_complete.owl and OBO, you'll have all of OBO's content at your fingertips, so g.label(s) will find a much higher proportion of labels.

You may also need a couple of other source ontologies that provide labels for http://www.bioassayontology.org/bao/bao_complete.owl, but my quick browsing found only OBO.

Nicholas Car