What is the exact implementation of Model data (or Graph) in Apache Jena?

Question

I understand a Model like this:
it consists of triples (Subject, Predicate/Property, Object):

A B C
C D E
E F G
G H I
X Y Z

We can represent the above set of triples as nodes and edges in a graph.
I want to get the value of subject 'A', which can involve chaining like C->E->G->I above in the Model; in graph terms, it should return the subgraph starting from node 'C'.

Here is my recursive code:

 public Model getRecursive(String subject) {
    Model newModel = ModelFactory.createDefaultModel();

    StmtIterator it = this.model.listStatements();
    while (it.hasNext()) {
        Statement statement = it.next();
        if(statement.getSubject().toString().equals(subject))
        {
            newModel = newModel.add(statement);
        }
    }

    Model objectModel = ModelFactory.createDefaultModel();
    it = newModel.listStatements();

    while (it.hasNext()) {
        Statement statement = it.next();
        objectModel = objectModel.add(getRecursive(statement.getObject().toString()));
    }

    newModel = newModel.add(objectModel);

    return newModel;
}

But my problem is that its complexity is too high.
Assume the Model has 1000 triples and a subject, say 'A', has a chain length of 10. Then, according to my code, the time complexity is 10*1000, because each recursive call iterates through all the triples to find those with the given subject and then recurses on their object values.

Is there a faster way to do this? I didn't find any methods in Graph or Model that can do it quickly.

1) StmtIterator sIter = model.listStatements(subject, null, (RDFNode)null) and 2) the SPARQL query SELECT * WHERE { ?predicate ?object . } will give the same result. Will both take the same time or not? – Badman Oct 05 '16 at 10:27

score 1 · Answer 1 · answered Oct 05 '16 at 09:14

Jena models are indexed - you don't have to scan them to find things.

First, work with resources, not strings:

Resource subject = model.getResource(uristring) ;

and then pass in an accumulator:

Model acc = ModelFactory.createDefaultModel();

so as not to copy results all the time.

recurse(Resource start, Model acc) ;

To access the model, use listStatements(s,p,o), whose arguments say what you are looking for (null matches anything).

StmtIterator sIter = model.listStatements(subject, null, (RDFNode)null) ;

finds statements with that subject only.

This is packaged up as a method on Resource:

StmtIterator sIter = subject.listProperties() ;

(subject knows which model it is in).

In addition, you should check for cycles or it will recurse forever.

answered Oct 05 '16 at 09:14

AndyS


I didn't understand "Jena models are indexed". Is it implemented with a HashMap? What is the time complexity of StmtIterator sIter = model.listStatements(subject, null, (RDFNode)null) and StmtIterator sIter = model.listStatements(null, prop, (RDFNode)null)? Is it O(n), O(1), or O(log(n))? – Badman Oct 05 '16 at 11:03
What if the Resource is a blank node? Resource subject = model.getResource(blankNodeString); StmtIterator sIter = model.listStatements(subject, null, (RDFNode)null); is not working. How do I do this when the Resource is a blank node? – Badman Oct 05 '16 at 11:20
There is no name for a blank node. You need to find the starting point somehow. After that, there is no "createResource" used - blank nodes are passed around as Resources. – AndyS Oct 06 '16 at 11:28
The indexing is hash-based (hash to triples of the same subject in this case). The complexity of the algorithm I outlined depends on the data traversed, not on the total amount of data: roughly average fan-out * depth of walk. – AndyS Oct 06 '16 at 11:31
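
To make the last two comments concrete, here is a minimal plain-Java sketch of a hash index keyed by subject, combined with the accumulator and cycle check from the answer. It is not Jena's actual implementation. The class name SubjectIndexSketch, the withSubject helper, and the use of string lists as triples are all assumptions made for illustration; in Jena the corresponding lookup would be subject.listProperties().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy triple store indexed by subject (illustration only, not Jena code). */
class SubjectIndexSketch {

    // Hash index: subject -> all triples that have this subject.
    private final Map<String, List<List<String>>> bySubject = new HashMap<>();

    /** Stores one triple as a three-element list [subject, predicate, object]. */
    void add(String s, String p, String o) {
        bySubject.computeIfAbsent(s, k -> new ArrayList<>())
                 .add(Arrays.asList(s, p, o));
    }

    /** Lookup by subject is a single hash probe, not a scan of every triple. */
    List<List<String>> withSubject(String s) {
        return bySubject.getOrDefault(s, Collections.emptyList());
    }

    /** Collects into acc every triple reachable from start. */
    void recurse(String start, Set<List<String>> acc) {
        for (List<String> triple : withSubject(start)) {
            // Set.add returns false for a triple already collected,
            // which stops the walk from looping forever on a cycle.
            if (acc.add(triple)) {
                recurse(triple.get(2), acc);
            }
        }
    }

    public static void main(String[] args) {
        SubjectIndexSketch store = new SubjectIndexSketch();
        store.add("A", "B", "C");
        store.add("C", "D", "E");
        store.add("E", "F", "G");
        store.add("G", "H", "I");
        store.add("X", "Y", "Z");

        Set<List<String>> acc = new LinkedHashSet<>();
        store.recurse("A", acc);
        System.out.println(acc.size()); // prints 4; X Y Z is never visited
    }
}
```

The walk only touches triples reachable from the start node, so its cost follows the fan-out and depth of the chain rather than the total number of triples stored.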
