
I am trying to load an external vocabulary (http://purl.org/eis/vocab/daq#) using `RDFDataMgr`. The vocabulary has valid syntax, can be dereferenced, and has both an RDF/XML and a Turtle serialisation. However, I am getting a `RiotException`:

org.apache.jena.riot.RiotException: [line: 23, col: 1 ] Broken token (newline): The Dataset Quality Vocabulary (daQ) is a lightweight, extensible core vocabulary for attaching the result of quality benchmarking of a linked open 

This exception only happens when I load the vocabulary from its remote URI; loading works when I use my local copy. I was using Jena 2.11.1, and yesterday I updated the code to the latest version, 3.3.0, but I still get the same exception. I suspect the problem is related to how Jena and its external libraries read non-local sources. Does anyone have an idea how this can be fixed?

Thanks, Jeremy

jerdeb

1 Answer


The endpoint returns Turtle that is broken: it is not valid syntax. There are raw newlines inside a string literal around line 23. Either use `"""` quoting for that multi-line string or fix the data some other way.

The RDF/XML is OK.

Use `RDFParser` to build a parser that sets the HTTP "Accept" header to "application/rdf+xml". The default header used by `RDFDataMgr` prefers Turtle.
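As a rough sketch of that approach (not tested against this particular server), the following builds a parser that asks for RDF/XML and sends the triples into a `Model`. The class name `LoadRdfXml` and the helper `loadPreferringRdfXml` are made up for illustration. It assumes a Jena version that has `RDFParser` (3.3.0 or later) and that the server honours content negotiation; if the server returns Turtle regardless, the same error would be expected.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.system.StreamRDFLib;

public class LoadRdfXml {

    // Hypothetical helper: parse the resource at `uri` into a fresh Model,
    // asking the server for RDF/XML instead of the default Turtle-first header.
    static Model loadPreferringRdfXml(String uri) {
        Model model = ModelFactory.createDefaultModel();
        RDFParser.create()
                 .source(uri)
                 // Only used for HTTP(S) sources; ignored for local files.
                 .httpAccept("application/rdf+xml")
                 // Send parsed triples straight into the model's graph.
                 .parse(StreamRDFLib.graph(model.getGraph()));
        return model;
    }

    public static void main(String[] args) {
        Model model = loadPreferringRdfXml("http://purl.org/eis/vocab/daq#");
        System.out.println("Loaded " + model.size() + " triples");
    }
}
```

The destination built by `StreamRDFLib.graph(...)` fills the model as the parser runs, so no separate streaming step is needed in user code.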

AndyS
  • But if I use an `RDFParser`, I would need to "stream" the data to build a model, right? Is there a way to force `RDFDataMgr` to accept, for example, rdf+xml? I thought that `RDFDataMgr.loadModel("someuri", Lang.RDFXML)` does that – jerdeb Jul 12 '17 at 15:35
  • `RDFDataMgr` uses `RDFParser` itself. `StreamRDFLib.graph(model.getGraph())` will create a destination for parsing that goes to a model, for example. `RDFDataMgr` itself does not have a function to set the header (the default is in `WebContent.defaultRDFAcceptHeader`); it is a convenience library on top of the core machinery. Turtle is preferred because it is faster and more robust. There is little Jena can do automatically and safely in the case of bad data (often due to transmission errors). – AndyS Jul 13 '17 at 07:56