
I have documents represented by RDF triples, and some users can add relationships between those documents. I plan to record those relationships as follows (a subset of the RDF/XML code):

<rdf:Description rdf:about="SOURCEDocId">
    <kb:tocMember rdf:resource="TARGETDocId"/>
</rdf:Description>

<rdf:Description rdf:about="TARGETDocId">
    <kb:isInToc rdf:resource="SOURCEDocId"/>
</rdf:Description> 

(The relationships are established in a table of contents, hence the names tocMember and isInToc.)

But I now need to store the ID of the user who created each relationship. One heretical way would be to add an attribute, something like:

<rdf:Description rdf:about="SOURCEDocId">
    <kb:tocMember xml:createdBy="USERId" rdf:resource="TARGETDocId"/>
</rdf:Description>

<rdf:Description rdf:about="TARGETDocId">
    <kb:isInToc xml:createdBy="USERId" rdf:resource="SOURCEDocId"/>
</rdf:Description>

I am not sure this would be accepted by RDF triplestores, and in any case this information could not be used in SPARQL queries. Another possibility is to create a link entity and qualify it, but that is a lot of machinery for a very small requirement. Is there a better way?

Okilele
  • Check out the available provenance methods: RDF reification, singleton property, RDF* and other approaches such as the one of Wikidata. – Ivo Velitchkov Apr 05 '20 at 12:07
  • `xml:createdBy="USERId"` - what is that? RDF is not XML ... either reification or n-ary relations - that's it. – UninformedUser Apr 05 '20 at 13:36
  • By the way, showing RDF data as XML is the worst case in my opinion. Why not use more readable formats like N-Triples, Turtle, etc.? There is more than RDF/XML available. – UninformedUser Apr 05 '20 at 13:37
  • "xml:createdBy="USERId"" was described in my question as a heretical thing. And RDF/XML was the simplest way to present the case. – Okilele Apr 06 '20 at 13:15

1 Answer


Statements about statements can be represented in RDF by:

1) RDF reification

2) n-ary relations

3) Singleton property

4) Named graphs

5) RDF*

Each option has advantages and disadvantages.

Here's how your case would be represented using RDF reification (the example uses the first statement):

:SOURCEDocId-tocMember-TARGETDocId
  rdf:type rdf:Statement ;
  :createdBy :USERId ;
  rdf:object :TARGETDocId ;
  rdf:predicate kb:tocMember ;
  rdf:subject :SOURCEDocId .

As noted in the comments, using Turtle makes it clear and readable. Still, since you gave the example in RDF/XML, here is the same reification serialized in XML:

  <rdf:Statement rdf:ID="SOURCEDocId-tocMember-TARGETDocId">
    <createdBy rdf:resource="#USERId"/>
    <rdf:subject rdf:resource="#SOURCEDocId"/>
    <rdf:predicate rdf:resource="http://example.org/kb/tocMember"/>
    <rdf:object rdf:resource="#TARGETDocId"/>
  </rdf:Statement>

In practice, statements are often not given a URI but left as blank nodes:

[
  rdf:type rdf:Statement ;
  :createdBy :USERId ;
  rdf:object :TARGETDocId ;
  rdf:predicate kb:tocMember ;
  rdf:subject :SOURCEDocId
] .

Here :createdBy is shown as a locally defined property, but it would of course be better to reuse a property from an appropriate vocabulary, such as dc:creator from Dublin Core or schema:creator from schema.org.
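To make the suggestion about reusing a vocabulary concrete, here is a minimal, self-contained Turtle sketch. The `kb:` and `:` namespace IRIs are placeholders assumed for illustration (they are not given in the question), and `dcterms:creator` from Dublin Core Terms stands in for :createdBy. It also asserts the relationship itself, because describing a statement through reification does not assert that statement.

```turtle
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
# Placeholder namespaces, assumed for this example only.
@prefix kb:      <http://example.org/kb/> .
@prefix :        <http://example.org/doc/> .

# The relationship itself; the reification below does not imply it.
:SOURCEDocId kb:tocMember :TARGETDocId .

# A blank-node description of that statement, carrying the provenance.
[
  rdf:type rdf:Statement ;
  rdf:subject :SOURCEDocId ;
  rdf:predicate kb:tocMember ;
  rdf:object :TARGETDocId ;
  dcterms:creator :USERId
] .
```

Because the provenance is expressed as ordinary triples, it can be matched in SPARQL like any other data, which addresses the concern about the XML attribute.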

With option (2), you would not link your source directly to your target but go through an intermediary node, say :targetEntry1, to which you can then attach the value and the provenance:

:SOURCEDocId kb:tocMember :targetEntry1 . 
:targetEntry1 :value :TARGETDocId ;
     :createdBy :USERId .
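
For comparison, option (3) could look roughly like the sketch below. It assumes the same placeholder namespaces as above, a hypothetical per-statement property name `kb:tocMember_1`, and a linking property written as `:singletonPropertyOf`, which is not part of the standard RDF vocabulary and is named here only for illustration.

```turtle
# Placeholder namespaces, assumed for this example only.
@prefix kb: <http://example.org/kb/> .
@prefix :   <http://example.org/doc/> .

# A property instance used for exactly one statement.
:SOURCEDocId kb:tocMember_1 :TARGETDocId .

# Tie it back to the generic property and attach the provenance.
kb:tocMember_1 :singletonPropertyOf kb:tocMember ;
    :createdBy :USERId .
```

The trade-off is that the plain triple with kb:tocMember is no longer stated directly, so queries for table-of-contents members have to go through the linking property or rely on inference.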
Ivo Velitchkov