
I have to serialize some specific properties (about ten properties per film) for a set of 1,500 entities from DBpedia. For each entity I run a SPARQL query to retrieve them, and for each ResultSet I store the data in the TDB dataset using the standard Apache Jena TDB API. I create a single statement for each property and add it with this code:

public void addSolution(QuerySolution currSolution, String subjectURI) {
    if (isWriteMode) {
        // Subject resource for the current film entity
        Resource currResource = datasetModel.createResource(subjectURI);

        // Property and literal value come from the current query solution
        Property prop = datasetModel.createProperty(currSolution.getResource("?prop").toString());
        Statement stat = datasetModel.createStatement(currResource, prop, currSolution.get("?value").toString());
        datasetModel.add(stat);
    }
}

How can I execute multiple add operations on a single dataset? What strategy should I use?

EDIT:

I'm able to execute all the code without errors, but no files are created by the TDBFactory. Why does this happen? I think I need Joshua Taylor's help.

Alessandro Suglia
  • 1,907
  • 1
  • 16
  • 23

2 Answers

2

It sounds like the query is running against the remote DBpedia endpoint. Assuming that's correct, you can do a couple of things.

Firstly, wrap the update in a transaction:

dataset.begin(ReadWrite.WRITE);
try {
  for (QuerySolution currSolution: results) {
    addSolution(...);
  }
  dataset.commit();
} finally {
  dataset.end();
}
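Putting that first suggestion together, here is a minimal self-contained sketch of batching all the adds into one write transaction. The class name, the `"tdb-dir"` store location, and the example URIs are illustrative, not from the question:

```java
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TdbBatchWrite {
    public static void main(String[] args) {
        // Hypothetical directory for the on-disk TDB store;
        // TDBFactory.createDataset() with no argument gives an in-memory store.
        Dataset dataset = TDBFactory.createDataset("tdb-dir");

        dataset.begin(ReadWrite.WRITE);
        try {
            Model model = dataset.getDefaultModel();
            // Add every statement inside the single transaction...
            model.add(model.createResource("http://example.org/film/1"),
                      model.createProperty("http://example.org/prop/title"),
                      "Example Title");
            // ...then commit once at the end.
            dataset.commit();
        } finally {
            // Always release the transaction, even on failure.
            dataset.end();
        }
        dataset.close();
    }
}
```

Committing once per result set (rather than once per statement) keeps the number of transactions small while still bounding how much work is lost on failure.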

Secondly, you might be able to save yourself some work by using CONSTRUCT to get a model back, rather than looping through the results. I'm not clear on what's going on with subjectURI, however, but it might be as simple as:

CONSTRUCT { <subjectURI> ?prop ?value }
WHERE {
  ... existing query body ...
}
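A sketch of that approach, assuming the query runs against the public DBpedia endpoint; the class name, the placeholder WHERE clause, and the example entity URI are all illustrative (the asker's real query body would go in `buildQuery`):

```java
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.rdf.model.Model;

public class FilmConstructFetch {

    // Build a CONSTRUCT query for one entity. The WHERE clause here is a
    // stand-in for the existing query body from the question.
    public static String buildQuery(String subjectURI) {
        return "CONSTRUCT { <" + subjectURI + "> ?prop ?value } "
             + "WHERE { <" + subjectURI + "> ?prop ?value }";
    }

    // Run the query remotely and return a Model. The whole result can then
    // be merged with a single datasetModel.add(results) inside a WRITE
    // transaction, instead of adding statement by statement.
    // Requires network access to the endpoint.
    public static Model fetch(String subjectURI) {
        QueryExecution qe = QueryExecutionFactory.sparqlService(
                "http://dbpedia.org/sparql", buildQuery(subjectURI));
        try {
            return qe.execConstruct();
        } finally {
            qe.close();
        }
    }

    public static void main(String[] args) {
        // Only build (not execute) the query here, to avoid a network call.
        System.out.println(buildQuery("http://dbpedia.org/resource/The_Matrix"));
    }
}
```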
user205512
  • 8,798
  • 29
  • 28
  • Yes, it is running on the DBpedia endpoint. I already do that: I commit after each result set is examined. But how many commits can I do before I get an error? – Alessandro Suglia Jun 26 '14 at 15:33
  • The size isn't important. The issue is that the dataset is being closed with an outstanding transaction, i.e. a `dataset.end()` is missing somewhere. – user205512 Jun 26 '14 at 15:44
  • But when do I need to do that? At the end of the whole process, or when I finish a single result set? – Alessandro Suglia Jun 26 '14 at 15:46
  • Depends what makes sense for your application. Do you want to have partial data if something goes wrong, or go back to having nothing added? For the former commit after each result set. – user205512 Jun 26 '14 at 17:23
  • I receive this error when I try to commit the first time: `14/06/26 19:09:09 WARN impl.Log4jLoggerAdapter: Inconsistency: base.allocOffset() = 5532 : allocOffset = 0` The second time: `Exception in thread "main" com.hp.hpl.jena.sparql.JenaTransactionException: Not in a transaction (location:--mem--/)` After this, the exception doesn't let me go ahead... – Alessandro Suglia Jun 26 '14 at 19:11
  • Why do you use: dataset.begin(ReadWrite.READ); – Alessandro Suglia Jun 26 '14 at 19:15
0

I've solved my problem, and I want to document it here for anyone who runs into the same issue. For each transaction you perform, you need to re-obtain the dataset model; don't reuse the same model across transactions.

So, for each transaction that you start, obtain the dataset model just after the call to begin(). I hope this is helpful.
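A minimal sketch of that pattern, assuming the same old `com.hp.hpl.jena` packages used in the question (the class name and example URIs are illustrative):

```java
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.tdb.TDBFactory;

public class PerTransactionModel {
    public static void main(String[] args) {
        // In-memory dataset for illustration; a directory path would persist to disk.
        Dataset dataset = TDBFactory.createDataset();

        // First transaction: obtain the model AFTER begin().
        dataset.begin(ReadWrite.WRITE);
        try {
            Model model = dataset.getDefaultModel();
            model.add(model.createResource("http://example.org/film/1"),
                      model.createProperty("http://example.org/prop/title"),
                      "First Film");
            dataset.commit();
        } finally {
            dataset.end();
        }

        // Second transaction: re-obtain the model rather than
        // reusing the reference from the previous transaction.
        dataset.begin(ReadWrite.WRITE);
        try {
            Model model = dataset.getDefaultModel();
            model.add(model.createResource("http://example.org/film/2"),
                      model.createProperty("http://example.org/prop/title"),
                      "Second Film");
            dataset.commit();
        } finally {
            dataset.end();
        }
        dataset.close();
    }
}
```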

Alessandro Suglia
  • 1,907
  • 1
  • 16
  • 23