I have an RDF/XML file with the following content:

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
    <rdf:Description rdf:about="http://someurl.com/def/elementtype/projectState">
        <rdfs:domain rdf:nodeID="projectState_0" />
    </rdf:Description>
</rdf:RDF>

which is parsed by the following code:

import rdflib

g = rdflib.Graph()

# Graph.load() was deprecated and later removed from rdflib; Graph.parse() accepts a file path directly
g.parse("problem/err.rdf", format='application/rdf+xml')

for s, p, o in g:
    print(f"subject:{s}")
    print(f"predicate:{p}")
    print(f"object:{o}")
    print()

I'd expect the predicate to expose the rdf:nodeID attribute, but I did not find a way to get it. The documentation also doesn't mention XML attributes on BNodes (blank nodes without content).

Stanislav Kralin
Lukas Schmid

1 Answer

Blank node identifiers generally aren't guaranteed to be preserved when importing graphs (some graph databases, such as GraphDB, do offer the option to preserve them). When I run the code the first time, the output is

subject:http://someurl.com/def/elementtype/projectState
predicate:http://www.w3.org/2000/01/rdf-schema#domain
object:N4ae82de375104726a1a2e5344ee6a44e

When I run it a second time, the output is

subject:http://someurl.com/def/elementtype/projectState
predicate:http://www.w3.org/2000/01/rdf-schema#domain
object:N79f7d744f68f439388484f02a9367be5

So, regarding the question of exposing the nodeID: the blank node is exposed (it is the object of the triple), but rdflib does not respect the identifier that you gave it and generates a fresh one on every parse. See this issue for more information.

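I would suggest one of the following:

i. Use a different graph database that supports blank node preservation

ii. Use an XML parser to read the rdf:nodeID attribute directly

iii. Elevate the blank node to a named resource, i.e. replace rdf:nodeID with rdf:resource and an IRI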
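
As a rough illustration of options ii and iii, here is a minimal sketch that uses only the standard library's xml.etree.ElementTree on the document from the question. The function names `node_ids` and `promote_node_ids` and the base IRI `http://someurl.com/bnode/` are made up for this example, and the rewrite only covers the simple shapes shown here (rdf:nodeID on rdf:Description or on a property element), so treat it as a starting point rather than a complete RDF/XML handler.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DESCRIPTION = f"{{{RDF_NS}}}Description"
ABOUT = f"{{{RDF_NS}}}about"
NODE_ID = f"{{{RDF_NS}}}nodeID"
RESOURCE = f"{{{RDF_NS}}}resource"

# The RDF/XML document from the question, inlined so the sketch is self-contained.
RDF_XML = b"""<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
    <rdf:Description rdf:about="http://someurl.com/def/elementtype/projectState">
        <rdfs:domain rdf:nodeID="projectState_0" />
    </rdf:Description>
</rdf:RDF>
"""


def node_ids(xml_bytes):
    """Option ii: return (subject, predicate, nodeID) for each property element
    that carries an rdf:nodeID attribute."""
    root = ET.fromstring(xml_bytes)
    found = []
    for description in root.findall(DESCRIPTION):
        subject = description.get(ABOUT)
        for prop in description:
            node_id = prop.get(NODE_ID)
            if node_id is not None:
                # ElementTree tags look like "{namespace}local"; drop the braces to get the IRI.
                predicate = prop.tag.replace("{", "", 1).replace("}", "", 1)
                found.append((subject, predicate, node_id))
    return found


def promote_node_ids(xml_bytes, base_iri):
    """Option iii: replace every rdf:nodeID with an IRI built from base_iri
    (a hypothetical namespace chosen by the caller) and return the new XML as bytes."""
    root = ET.fromstring(xml_bytes)
    for elem in root.iter():
        node_id = elem.attrib.pop(NODE_ID, None)
        if node_id is None:
            continue
        # On rdf:Description the node is a subject (rdf:about);
        # on a property element it is an object (rdf:resource).
        target = ABOUT if elem.tag == DESCRIPTION else RESOURCE
        elem.set(target, base_iri + node_id)
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    for subject, predicate, node_id in node_ids(RDF_XML):
        print(f"subject:{subject}")
        print(f"predicate:{predicate}")
        print(f"nodeID:{node_id}")
    print(promote_node_ids(RDF_XML, "http://someurl.com/bnode/").decode("utf-8"))
```

The rewritten XML can then be handed to rdflib as before; since the object is now an IRI rather than a blank node, it should come back unchanged on every parse. ElementTree may rename the namespace prefixes when serializing, which does not affect the RDF content.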

Thomas