Traversing anonymous/blank nodes in Jena

Question

I am using Apache Jena's API on a graph that also contains anonymous (blank) nodes, produced by owl:unionOf and owl:intersectionOf. One such example is:

<owl:Class>
   <owl:unionOf rdf:parseType="Collection">
        <rdf:Description rdf:about="http://www.summyUrl.com/something#Entity1"/>
        <rdf:Description rdf:about="http://www.summyUrl.com/something#Entity2"/>
   </owl:unionOf>
</owl:Class>

which is an anonymous node/resource. When I try to get its URI, it is something like:

"-50a5734d:15d839467d9:-1b8b"

I can neither run a SPARQL query using such identifiers (parsing them throws an exception) nor find an appropriate Jena method to handle them.

I am looking for a way to expand such nodes and get all the resources nested inside them.

For example, in the case below, it should return <http:/.../Entity1>, <http:/.../Entity2> and <http:/.../Entity3>:

<owl:Class>
   <owl:unionOf rdf:parseType="Collection">
        <rdf:Description rdf:about="http://www.summyUrl.com/something#Entity1"/>
        <owl:unionOf rdf:parseType="Collection">
            <rdf:Description rdf:about="http://www.summyUrl.com/something#Entity2"/>
            <rdf:Description rdf:about="http://www.summyUrl.com/something#Entity3"/>
        </owl:unionOf>
   </owl:unionOf>
</owl:Class>

Is there a built-in Jena method to handle this?
If not, how can I do it efficiently?

You should always look at the Turtle serialization of the data rather than the RDF/XML one; then you'll see that you can use SPARQL property paths. Admittedly, that doesn't work for arbitrarily nested class expressions; since this is OWL, an OWL reasoner is the way to go for those. - UninformedUser, Jul 28 '17 at 06:10
It would be good to see the whole query first, but in principle the common pattern is `?subclass rdfs:subClassOf/(owl:unionOf/rdf:rest*/rdf:first)+ ?superclass` if you're looking for a superclass defined by a union of classes in OWL. - UninformedUser, Jul 28 '17 at 06:12
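
To make the list part of that path concrete, here is a minimal, dependency-free sketch of the walk that `(owl:unionOf/rdf:rest*/rdf:first)+` performs. It does not use Jena: the map-based `graph`, the `ListPathSketch` class and the node labels such as `_:l1` are all hypothetical stand-ins for an RDF graph, and unlike the SPARQL path it keeps only the leaves rather than the intermediate anonymous classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListPathSketch {
    static final String UNION = "owl:unionOf", FIRST = "rdf:first", REST = "rdf:rest", NIL = "rdf:nil";

    // subject -> (predicate -> object): a toy stand-in for an RDF graph, one value per property
    static final Map<String, Map<String, String>> graph = new HashMap<>();

    static void add(String s, String p, String o) {
        graph.computeIfAbsent(s, k -> new HashMap<>()).put(p, o);
    }

    static String get(String s, String p) {
        Map<String, String> props = graph.get(s);
        return props == null ? null : props.get(p);
    }

    // Walks owl:unionOf, then the rdf:rest chain, taking rdf:first at each cell,
    // and recurses into nested unions. Returns the nodes that have no owl:unionOf.
    static List<String> members(String cls) {
        List<String> out = new ArrayList<>();
        String cell = get(cls, UNION);
        if (cell == null) {                          // no union here: cls is a leaf
            out.add(cls);
            return out;
        }
        while (cell != null && !cell.equals(NIL)) {  // rdf:rest*
            String item = get(cell, FIRST);          // rdf:first
            if (item != null) {
                out.addAll(members(item));           // the "+" in the path: nested unions
            }
            cell = get(cell, REST);
        }
        return out;
    }

    // Encodes: _:c owl:unionOf ( :Entity1 _:d ) .  _:d owl:unionOf ( :Entity2 :Entity3 ) .
    static List<String> demo() {
        graph.clear();
        add("_:c", UNION, "_:l1");
        add("_:l1", FIRST, ":Entity1");
        add("_:l1", REST, "_:l2");
        add("_:l2", FIRST, "_:d");
        add("_:l2", REST, NIL);
        add("_:d", UNION, "_:m1");
        add("_:m1", FIRST, ":Entity2");
        add("_:m1", REST, "_:m2");
        add("_:m2", FIRST, ":Entity3");
        add("_:m2", REST, NIL);
        return members("_:c");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [:Entity1, :Entity2, :Entity3]
    }
}
```

The Turtle list syntax `( :Entity1 _:d )` is shorthand for exactly these rdf:first/rdf:rest cells, which is why the Turtle serialization makes the structure easier to see than RDF/XML.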

Pratik · Accepted Answer · 2017-07-31T04:42:22.087

I tried doing it in this way and it worked nicely:

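// Requires java.util.{Arrays, LinkedList, List}, org.apache.jena.rdf.model.{Property, Resource}
// and org.apache.jena.vocabulary.{OWL, RDF}.

// Properties followed during traversal. This is a field of the enclosing class, because
// Java does not allow a "private static" declaration inside a method body.
private static final List<Property> collectionProperties = Arrays.asList(OWL.unionOf, OWL.intersectionOf, RDF.first, RDF.rest);

/**
 * Recursively explodes an <b>anonymous resource</b> (collection resource) and returns its
 * nested resources. Follows <code>owl:unionOf</code>, <code>owl:intersectionOf</code>,
 * <code>rdf:first</code> and <code>rdf:rest</code> while traversing.
 *
 * @param resource the resource to explode; a non-anonymous resource is returned as-is
 * @return the nested resources, in traversal order
 */
private List<Resource> explodeAnonymousResource(Resource resource)
{
    List<Resource> resources=new LinkedList<Resource>();
    boolean needToTraverseNext=false;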

    if(resource.isAnon())
    {
        for(Property cp:collectionProperties)
        {
            // getPropertyResourceValue returns null if the property is absent or has no resource value
            Resource nextResource=resource.getPropertyResourceValue(cp);
            if(nextResource!=null && !nextResource.equals(RDF.nil))
            {
                resources.addAll(explodeAnonymousResource(nextResource));

                needToTraverseNext=true;
            }
        }

        if(!needToTraverseNext)
        {
            resources.add(resource);
        }
    }
    else
    {
        resources.add(resource);
    }

    return resources;
}
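
As a possible alternative to following rdf:first and rdf:rest by hand, Jena's model API has an RDFList view that exposes a collection as a Java list. The following is only a sketch under some assumptions: Jena 3.x or later (org.apache.jena packages) is on the classpath, the class name `RdfListMembers` and the helpers `members` and `demo` are hypothetical, and the nested union from the question is built in memory instead of being parsed. Unlike the method above, it silently skips anonymous nodes that carry neither owl:unionOf nor owl:intersectionOf.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.RDFList;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.OWL;

public class RdfListMembers {

    // Returns the URI resources found below a (possibly nested) class expression.
    static List<Resource> members(Resource cls) {
        List<Resource> out = new ArrayList<>();
        if (cls.isURIResource()) {          // named class: a leaf
            out.add(cls);
            return out;
        }
        for (Property p : new Property[] { OWL.unionOf, OWL.intersectionOf }) {
            Resource head = cls.getPropertyResourceValue(p);
            if (head == null || !head.canAs(RDFList.class)) {
                continue;                   // property absent, or its value is not an RDF list
            }
            for (RDFNode n : head.as(RDFList.class).asJavaList()) {
                if (n.isResource()) {
                    out.addAll(members(n.asResource()));
                }
            }
        }
        return out;
    }

    // Builds the nested union from the question in memory and returns the member URIs.
    static List<String> demo() {
        Model m = ModelFactory.createDefaultModel();
        String ns = "http://www.summyUrl.com/something#";
        Resource e1 = m.createResource(ns + "Entity1");
        Resource e2 = m.createResource(ns + "Entity2");
        Resource e3 = m.createResource(ns + "Entity3");
        Resource inner = m.createResource()
                .addProperty(OWL.unionOf, m.createList(new RDFNode[] { e2, e3 }));
        Resource outer = m.createResource()
                .addProperty(OWL.unionOf, m.createList(new RDFNode[] { e1, inner }));
        List<String> uris = new ArrayList<>();
        for (Resource r : members(outer)) {
            uris.add(r.getURI());
        }
        return uris;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The `canAs(RDFList.class)` check guards against a property value that is not a list before the view is requested with `as(RDFList.class)`.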

ssz · Answer 2 · 2017-07-29T17:34:08.293

Using the Jena Model API:

    String s = "<rdf:RDF\n" +
            "    xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n" +
            "    xmlns:dc=\"http://purl.org/dc/elements/1.1/\"\n" +
            "    xmlns:owl=\"http://www.w3.org/2002/07/owl#\"\n" +
            "    xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"\n" +
            "    xmlns:xsd=\"http://www.w3.org/2001/XMLSchema#\">\n" +
            "  <owl:Ontology/>\n" +
            "  <owl:Class>\n" +
            "    <owl:unionOf rdf:parseType=\"Collection\">\n" +
            "      <owl:Class rdf:about=\"http://www.summyUrl.com/something#Entity1\"/>\n" +
            "      <owl:Class>\n" +
            "        <owl:unionOf rdf:parseType=\"Collection\">\n" +
            "          <owl:Class rdf:about=\"http://www.summyUrl.com/something#Entity1\"/>\n" +
            "          <owl:Class rdf:about=\"http://www.summyUrl.com/something#Entity2\"/>\n" +
            "        </owl:unionOf>\n" +
            "      </owl:Class>\n" +
            "    </owl:unionOf>\n" +
            "  </owl:Class>\n" +
            "</rdf:RDF>";
    Model m = ModelFactory.createDefaultModel();
    try (InputStream in = new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8))) {
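        // read(InputStream, base, lang): the two-argument overload would treat the label as the base URI
        m.read(in, null, Lang.RDFXML.getLabel());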
    }
    //m.write(System.out, "ttl");
    m.listStatements()
            .mapWith(Statement::getObject)
            .filterKeep(RDFNode::isURIResource)
            .mapWith(RDFNode::asResource)
            .filterDrop(OWL.Class::equals)
            .filterDrop(OWL.Ontology::equals)
            .filterDrop(RDF.nil::equals)
            .mapWith(Resource::getURI)
            .forEachRemaining(System.out::println);

The output (Entity1 appears twice because the data mentions it twice; statement order is not guaranteed):

http://www.summyUrl.com/something#Entity1
http://www.summyUrl.com/something#Entity2
http://www.summyUrl.com/something#Entity1

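This is just one example; there are many other ways to handle this. Note that it lists every URI-valued object in the whole model rather than the members of one particular class expression.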
