
I am new to Jena and am trying to work with the YAGO dataset using TDB. The dataset is about 200M, and every time I run the same query it takes about 5 minutes to load the data before it returns the results. Am I misunderstanding some part of TDB? My code is below.

String directory = "tdb";
Dataset dataset = TDBFactory.createDataset(directory);      
dataset.begin(ReadWrite.WRITE);
Model tdb = dataset.getDefaultModel();
//String source = "yagoMetaFacts.ttl";
//FileManager.get().readModel(tdb, source);
String queryString = "SELECT DISTINCT ?p WHERE { ?s ?p ?o. }";
Query query = QueryFactory.create(queryString);
try(QueryExecution qexec = QueryExecutionFactory.create(query, tdb)){
    ResultSet results = qexec.execSelect();
    ResultSetFormatter.out(System.out, results, query) ;
}
dataset.commit();    
dataset.end();
Charlotte
  • Of course it does if you call `readModel`. Why don't you try it without this line? – UninformedUser Feb 16 '17 at 05:51
  • Hi, I tried running without the readModel line, but then I get no results. If I do not indicate the dataset I need, how does TDB know which dataset to use? – Charlotte Feb 16 '17 at 06:27
  • I think you have to call `tdb.commit`, see https://jena.apache.org/documentation/tdb/tdb_transactions.html#write-transactions – UninformedUser Feb 16 '17 at 13:42
  • Try tdbloader (command line) to load the data before running your programme. – AndyS Feb 16 '17 at 14:24
  • @AKSW Hi, I've modified the code in the question but still get no results. I am really confused now. If I do not specify the location of the dataset, how can TDB know where it is? Doesn't it have to load the data at least once? – Charlotte Feb 17 '17 at 00:36
  • @AndyS Hi, I tried using tdbloader in cmd and then running the program in Eclipse, and got the exception `Exception in thread "main" org.apache.jena.tdb.TDBException: Can't open database at location C:\XXX\yago_meta\tdb\ as it is already locked by the process with PID 8308. TDB databases do not permit concurrent usage across JVMs so in order to prevent possible data corruption you cannot open this location from the JVM that does not own the lock for the dataset` – Charlotte Feb 17 '17 at 00:39
  • Are you sure that you finished the whole `tdbloader` process (and without exceptions) **before** using your Java application? – UninformedUser Feb 17 '17 at 01:59

1 Answer


There are two ways to load data into TDB: through the API or from the command line. Many thanks to @AKSW and @AndyS.

1 Load data via API

This code needs to be executed only once. The readModel call in particular takes a long time.

String directory = "tdb";
Dataset dataset = TDBFactory.createDataset(directory);
dataset.begin(ReadWrite.WRITE);
try {
    Model tdb = dataset.getDefaultModel();
    String source = "yagoMetaFacts.ttl";
    FileManager.get().readModel(tdb, source); // slow: parses the whole file
    dataset.commit(); // Important!! This commits the loaded data to TDB.
} finally {
    dataset.end(); // always end the transaction, even if loading fails
}

Once the data has been loaded into TDB, the following code can be used to query it. It is not necessary to load the data again.

String directory = "path\\to\\tdb"; // the same directory the data was loaded into
Dataset dataset = TDBFactory.createDataset(directory);
dataset.begin(ReadWrite.READ); // a read transaction is enough for querying
try {
    Model tdb = dataset.getDefaultModel();
    String queryString = "SELECT DISTINCT ?p WHERE { ?s ?p ?o. }";
    Query query = QueryFactory.create(queryString);
    try (QueryExecution qexec = QueryExecutionFactory.create(query, tdb)) {
        ResultSet results = qexec.execSelect();
        ResultSetFormatter.out(System.out, results, query);
    }
} finally {
    dataset.end();
}
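
The two steps can also live in one program that loads only on the first run. The following is a minimal sketch, not a definitive implementation. It assumes Jena 3.x with TDB on the classpath, and it assumes that a non-empty default graph means the load has already happened. The class name `TdbLoadOnce` and the helpers `loadIfEmpty` and `printPredicates` are hypothetical names chosen for illustration, and `RDFDataMgr.read` is used here in place of `FileManager.get().readModel`.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.tdb.TDBFactory;

public class TdbLoadOnce {

    /**
     * Loads the file into the default graph only if that graph is empty.
     * Returns true when a load was performed, false when it was skipped.
     * (Hypothetical helper; "empty means not yet loaded" is an assumption.)
     */
    static boolean loadIfEmpty(Dataset dataset, String source) {
        // Check emptiness inside a read transaction.
        boolean empty;
        dataset.begin(ReadWrite.READ);
        try {
            empty = dataset.getDefaultModel().isEmpty();
        } finally {
            dataset.end();
        }
        if (!empty) {
            return false;
        }

        // Load inside a write transaction and commit so the data is persisted.
        dataset.begin(ReadWrite.WRITE);
        try {
            RDFDataMgr.read(dataset.getDefaultModel(), source);
            dataset.commit();
        } finally {
            dataset.end();
        }
        return true;
    }

    /** Prints the distinct predicates in the default graph (hypothetical helper). */
    static void printPredicates(Dataset dataset) {
        Query query = QueryFactory.create("SELECT DISTINCT ?p WHERE { ?s ?p ?o. }");
        dataset.begin(ReadWrite.READ);
        try (QueryExecution qexec = QueryExecutionFactory.create(query, dataset)) {
            ResultSetFormatter.out(System.out, qexec.execSelect(), query);
        } finally {
            dataset.end();
        }
    }

    public static void main(String[] args) {
        Dataset dataset = TDBFactory.createDataset("tdb");
        boolean loaded = loadIfEmpty(dataset, "yagoMetaFacts.ttl");
        System.out.println(loaded ? "Loaded data on this run" : "Store already populated, load skipped");
        printPredicates(dataset);
        dataset.close();
    }
}
```

On the first run this should spend its time parsing the file. On later runs it should skip straight to the query, because the data is already stored in the TDB directory.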

2 Load data via CMD

To load data

>tdbloader --loc=path\to\tdb path\to\dataset.ttl

To query

>tdbquery --loc=path\to\tdb --query=q1.rq

q1.rq is the file that stores the query. You should get results like this:

-------------------------------------------------------
| p                                                   |
=======================================================
| <http://yago-knowledge.org/resource/hasGloss>       |
| <http://yago-knowledge.org/resource/occursSince>    |
| <http://yago-knowledge.org/resource/occursUntil>    |
| <http://yago-knowledge.org/resource/byTransport>    |
| <http://yago-knowledge.org/resource/hasPredecessor> |
| <http://yago-knowledge.org/resource/hasSuccessor>   |
| <http://www.w3.org/2000/01/rdf-schema#comment>      |
-------------------------------------------------------
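
For reference, a q1.rq that produces this kind of output could hold the same query used in the API example. The file name comes from the tdbquery command above. The contents shown here are an assumption based on that query:

```sparql
# q1.rq: list every distinct predicate in the default graph
SELECT DISTINCT ?p
WHERE { ?s ?p ?o . }
```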
Charlotte