Questions tagged [tdb]

TDB is an open source RDF database developed and maintained by the Apache Jena project. It is a Java-based embedded database that may be exposed over HTTP using the Fuseki server, also from the Apache Jena project.
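
As a rough sketch of the embedded usage (the directory path, IRIs, and literal below are illustrative only, not taken from any particular project), a TDB store can be opened from Java and written to inside a transaction:

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.tdb.TDBFactory;

public class TdbEmbedSketch {
    public static void main(String[] args) {
        // Open (or create) a persistent TDB dataset in a local directory.
        Dataset dataset = TDBFactory.createDataset("data/tdb-demo");

        dataset.begin(ReadWrite.WRITE);
        try {
            Model model = dataset.getDefaultModel();
            // Persist one example triple.
            model.add(model.createResource("http://example.org/book/1"),
                      model.createProperty("http://purl.org/dc/elements/1.1/title"),
                      "An example title");
            dataset.commit();
        } finally {
            dataset.end();
        }
    }
}
```

Once the same database is published by a Fuseki server, it can be queried over HTTP. A minimal sketch, assuming a Fuseki instance at the hypothetical endpoint http://localhost:3030/ds and the jena-rdfconnection module on the classpath:

```java
import org.apache.jena.rdfconnection.RDFConnection;
import org.apache.jena.rdfconnection.RDFConnectionFactory;

public class FusekiQuerySketch {
    public static void main(String[] args) {
        // Connect to the (hypothetical) Fuseki dataset endpoint and count its triples.
        try (RDFConnection conn = RDFConnectionFactory.connect("http://localhost:3030/ds")) {
            conn.querySelect("SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
                    row -> System.out.println("triples: " + row.getLiteral("n").getLong()));
        }
    }
}
```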

Features

Its features include the following (a minimal usage sketch follows the list):

  • Persisting RDF Triples or Quads
  • Full SPARQL execution
  • Write-ahead logging to provide serializable transactions and fault tolerance
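
A minimal sketch of the last two points taken together, assuming the dataset variable from the embedded example above and the classes in org.apache.jena.query: a SPARQL query runs inside a read transaction, while writes go through write transactions backed by the write-ahead log.

```java
dataset.begin(ReadWrite.READ);
try (QueryExecution qExec = QueryExecutionFactory.create(
        "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10", dataset)) {
    ResultSet results = qExec.execSelect();
    ResultSetFormatter.out(results); // print the first ten matches as a text table
} finally {
    dataset.end();
}
```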

Scalability and Performance

TDB typically scales up to datasets of a few hundred million triples or quads.

Compared to commercial alternatives, TDB is often less scalable, primarily because, although it is persisted to disk, it is fundamentally designed to operate like an in-memory database. It relies heavily on RAM caches and memory-mapped files, so scalability tends to be limited by the machine's RAM.

TDB also does not offer any kind of clustered mode, so it cannot be scaled horizontally without additional technologies (e.g. manually created replicas and load balancers).

TDB is typically included in the Berlin SPARQL Benchmark Results for those interested in comparative performance data.

137 questions
2 votes, 0 answers

How to filter down a large Jena Model in TDB

I have a large RDF model that doesn't fit in memory. I am currently loading the entire thing into TDB, but I would like to instead filter it down by focusing on only a subgraph (all properties about all resources which are subclassof or type of…
lmsurprenant
  • 1,723
  • 2
  • 14
  • 28
2 votes, 1 answer

Adding Individuals in Jena OntModel and accessing them. Exception ObjectFileStorage.read Impossibly large object

I am trying to add some individuals to my existing Ontology (OntModel) with an objective to add the values/literals for DatatypeProperty with a specific datatype known at runtime from the range of the datatypeproperty. My OntModel is backed by a…
2 votes, 2 answers

Jena Rule Engine with TDB

I have my data loaded in a TDB model and have written some rules using Jena in order to apply them to the TDB. Then I am storing the inferred data into a new TDB. I applied the case above on a small dataset (~200kb) and it worked just fine. HOWEVER, my…
Dr.AdeeB
  • 31
  • 1
  • 4
2 votes, 1 answer

Jena TDB insert statement resulting in empty fields

I am using Jena APIs to insert and update triples in Jena TDB. My design is such that each of the insert operation is within the transaction control. For example: dataset.begin (ReadWrite.WRITE) try { // 1st insert operation …
1 vote, 0 answers

How can I make tdb2.xloader continue loading when the load is accidentally interrupted?

Apache version 4.8.0. This afternoon I was using tdb2.xloader to load a 120G ttl file into the database, but it was accidentally interrupted halfway; the loading had reached the POS stage. Is there any chance I can continue building on…
unstuck
  • 31
  • 3
1 vote, 0 answers

docker mkdir won't create a directory

I am trying to run a bash script which should load data into jena. This script comes from a github repository and was allegedly working on the owner's machine but on mine it won't run, even though I followed the instructions. So let me first…
Greenfish
  • 358
  • 2
  • 5
  • 19
1 vote, 1 answer

Trying to load Wikidata truthy-latest.nt with tdb2.tdbloader results in Code: 58/PROHIBITED_COMPONENT_PRESENT in USER

With Apache Jena Fuseki I am trying to load the latest-truthy.nt dataset from Wikidata, but I am getting the following error while trying to import the file. With the inspiration from the following success from Bitplan where they did have…
NLxDoDge
  • 189
  • 2
  • 13
1 vote, 1 answer

Does anyone know how to get the tdb2.dump command to actually do anything

I'm trying to dump a jena database as triples. There seems to be a command that sounds perfectly suited to the task: tdb2.dump jena@debian-clean:~$ ./apache-jena-3.8.0/bin/tdb2.tdbdump --help tdbdump : Write a dataset to stdout (defaults to…
Ben Hillier
  • 2,126
  • 1
  • 10
  • 15
1 vote, 1 answer

How to Build a Fuseki TDB over a POST HTTP Request with Assembler.ttl? (How to send File over POST request)

https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html Reading the documentation, we see that we are able to send a POST request containing assembler .ttl definitions to Fuseki Endpoint. Although, when trying that, my application…
1 vote, 0 answers

Jena TDB load progress using the Java API

I am wondering if it is possible to get load speed information when using the Java API. The code I have to load "large" files (few gb) is this: try (InputStream in = new FileInputStream(arguments.input)) { RDFParser.create() …
Jasper
  • 628
  • 1
  • 9
  • 19
1 vote, 1 answer

What does Apache Jena's tdb2.tdbcompact do?

I have read the description of this command, but still don't know what it compresses, and why I should use it? BTW: the subdirectory Data-NNN is for previous and current versions of the databases, which means it can only have 1000 versions of the…
Gao
  • 912
  • 6
  • 16
1 vote, 1 answer

Is Jena's Fuseki not compatible with tdb2.tdbloader?

I have a requirement to incrementally update TDB files daily. So I'm using tdb2.tdbloader to do the job with a generated N-Triples file. But when the job is done, the data directory which contains the TDB data has a new directory called "data-0001" or…
Gao
  • 912
  • 6
  • 16
1 vote, 1 answer

Jena TDB Dataset begin() fails

I want to use Jena TDB in a project. This is what I added in my POM: org.apache.jena apache-jena-libs 3.7.0 pom These are my…
Janothan
  • 446
  • 4
  • 16
1 vote, 1 answer

SPARQL: OFFSET without ORDER BY to get all results of a query?

I have a large TDB dataset (cf. this post Fuseki config for 2 datasets + text index : how to use turtle files? ) and I need to extract data in order to make a "subgraph" and import it in fuseki. I found that OFFSET could be a solution to get all…
vvffl
  • 73
  • 1
  • 9
1 vote, 1 answer

Fuseki config for 2 datasets + text index : how to use turtle files?

I'm new to fuseki and want to use 2 TDB datasets for our project : a small one for our own data, and a large one (168 M triples, imported data from http://data.bnf.fr). We need to index the data because SPARQL queries using "FILTER(CONTAINS())"…
vvffl
  • 73
  • 1
  • 9