
I am using Jena TDB to persist RDF data. Before this, I searched for how to persist data in TDB and came across the question at this link. The answer provided by Ryan clearly explains the difference between the various concepts, and one of the points I took away about datasets is:

"A Dataset is like a DataSource, but its triples are static - you don't expect new ones to be added or existing ones to be deleted. These guys are read-only"

Keeping this in mind, I stored some RDF data in a named model within a dataset. Now, when I try to store or append new data to this model, it overwrites the previous data. This is the opposite of the read-only nature Ryan describes. So the points on which I need clarification are:

  1. Is Ryan correct in what he says about datasets?
  2. If the answer to point 1 is yes, why am I able to overwrite the data?
  3. Does TDB check for duplicates before persisting data? I ask because I inserted a couple of duplicate RDF statements and expected the statement count to increase, but it did not.
Stanislav Kralin
Haroon Lone
    "A Dataset is like a DataSource, but its triples are static - you don't expect new ones to be added or existing ones to be deleted. These guys are read-only" That's simply not correct. The SPARQL standard includes *UPDATE*, *DELETE*, etc. That answer is from 2011, which is 2 years before SPARQL Update was published. – Joshua Taylor Jun 06 '15 at 13:34

2 Answers


You shouldn't expect the triple count to increase when you insert a triple that already exists in the same graph. (I am guessing you are working on the default graph in TDB.) An RDF graph is a set of triples, so you should be surprised if the count did increase; that would most likely be due to a bug in the underlying triple store. Multiple triples stating the same fact within the same context are simply redundant.

If your intention is to collect facts from different contexts (e.g. different sources of information), you can store the triples in separate graphs. In fact, most triple stores are quad stores and allow you to do just that. TDB is a quad store, so you can load and work with multiple graphs. Read more about TDB datasets.
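To make the set semantics concrete, here is a minimal sketch in plain Java. It is a toy model and deliberately does not use Jena or TDB: the class `ToyQuadStore`, its methods, and the `urn:` names are hypothetical and exist only for this illustration. Each named graph is modelled as a set of triples, so re-adding a triple to the same graph changes nothing, while the same triple in a different graph is stored separately.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: this is NOT Jena's or TDB's API, just an illustration
// of "a graph is a set of triples" and "a dataset holds several graphs".
final class ToyQuadStore {
    // Graph name -> set of (subject, predicate, object) triples.
    private final Map<String, Set<List<String>>> graphs = new HashMap<>();

    // Adds a triple to the named graph; returns false if it was already there.
    boolean add(String graph, String s, String p, String o) {
        return graphs.computeIfAbsent(graph, g -> new HashSet<>())
                     .add(Arrays.asList(s, p, o));
    }

    // Number of distinct triples in one graph (0 for an unknown graph).
    int size(String graph) {
        Set<List<String>> triples = graphs.get(graph);
        return triples == null ? 0 : triples.size();
    }

    public static void main(String[] args) {
        ToyQuadStore store = new ToyQuadStore();
        store.add("urn:graph:a", "urn:s", "urn:p", "o");
        store.add("urn:graph:a", "urn:s", "urn:p", "o"); // duplicate, no effect
        store.add("urn:graph:b", "urn:s", "urn:p", "o"); // same triple, other graph
        System.out.println(store.size("urn:graph:a")); // prints 1
        System.out.println(store.size("urn:graph:b")); // prints 1
    }
}
```

In this model the duplicate is neither rejected with an error nor stored twice; it is simply absorbed by the set, which matches the unchanged count described in the question.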

chris

After receiving feedback and experimenting with Jena, I found the following answers:

  1. Is Ryan correct in what he says about datasets?

    From Joshua's comment and from reading the API documentation, I found that the Jena framework has improved a lot since then, so Ryan's explanation of datasets is no longer valid.

  2. Does TDB check for duplicates before persisting data? I ask because I inserted a couple of duplicate RDF statements and expected the statement count to increase, but it did not.

    I inserted duplicate statements into the same named graph and the count did not increase. The documentation does not mention an explicit duplicate check. My understanding is that when the same statement already exists in the graph, the new one simply takes the place of the existing one, so the count stays the same. This is consistent with an RDF graph being a set of triples.

Haroon Lone