How can I dump embedded Blazegraph contents to an RDF file?

Question

I have created a blazegraph RDF4J repository and connection in Scala:

val props = new Properties()
props.put(Options.BUFFER_MODE, BufferMode.DiskRW)
props.put(Options.FILE, "embedded.jnl")
var sail = new BigdataSail(props)
var repo = new BigdataSailRepository(sail)
repo.initialize()
var cxn = repo.getConnection()

I can add statements, retrieve SPARQL results, etc.

Now I'd like to dump the contents of the repository to an RDF file, like this:

Rio.write(model, System.out, RDFFormat.RDFXML);

But if I try to substitute my cxn or repo for the expected model argument, Eclipse complains:

overloaded method value write with alternatives:
  (x$1: Iterable[org.openrdf.model.Statement],x$2: java.io.Writer,x$3: org.openrdf.rio.RDFFormat)Unit
  (x$1: Iterable[org.openrdf.model.Statement],x$2: java.io.OutputStream,x$3: org.openrdf.rio.RDFFormat)Unit
 cannot be applied to (com.bigdata.rdf.sail.BigdataSailRepository, java.io.FileOutputStream, org.openrdf.rio.RDFFormat).

How do I get from the repo and connection that I have to a model expected by Rio.write()? Or can I dump the triples in some other way?

`cxn` is just a connection, isn't it? How do you expect it to be dumped? See http://docs.rdf4j.org/javadoc/latest/org/eclipse/rdf4j/rio/Rio.html for all the write methods. I can't see anything that dumps a whole repository to a file, at least not in the Rio class. — UninformedUser, May 11 '17 at 19:34
Yeah, I guess dumping the connection was a long shot. I was hoping that there was something for the repository, though. Thanks for checking. I'm also importing some of Blazegraph's methods from com.bigdata. I guess I'll look there next. — Mark Miller, May 11 '17 at 19:43
Maybe there is a CLI tool for this, like the one for MySQL? You should definitely ask Blazegraph support on the mailing list. I'm pretty sure they can help you. — UninformedUser, May 12 '17 at 05:37

score 2 · Accepted Answer · answered May 12 '17 at 09:45

This is described quite nicely at http://docs.rdf4j.org/programming/, section 3.2.8, "Using RDFHandlers":

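import org.eclipse.rdf4j.query.QueryLanguage;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.RDFWriter;
import org.eclipse.rdf4j.rio.Rio;

// Stream every triple matched by the CONSTRUCT query straight to the writer
try (RepositoryConnection conn = repo.getConnection()) {
    RDFWriter writer = Rio.createWriter(RDFFormat.TURTLE, System.out);
    conn.prepareGraphQuery(QueryLanguage.SPARQL,
        "CONSTRUCT {?s ?p ?o } WHERE {?s ?p ?o } ").evaluate(writer);
}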

To dump to a file, pass a FileOutputStream to Rio.createWriter instead of System.out.

Mark Miller · Answer 2 · 2017-05-12T13:18:04.717

This Scala code worked for me. It's entirely based on ChristophE's answer. I already had a connection, but I did need to create a file output stream. I removed the try wrapper since there wasn't any catch block. Not recommended for production!

import java.io.FileOutputStream

val out = new FileOutputStream("rdf.ttl")
val writer = Rio.createWriter(RDFFormat.TURTLE, out)
cxn.prepareGraphQuery(QueryLanguage.SPARQL,
    "CONSTRUCT {?s ?p ?o } WHERE {?s ?p ?o } ").evaluate(writer)
out.close() // flush and release the file handle

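Since the snippet above drops the try wrapper, here is a minimal sketch of the same dump with the output stream closed in a finally block. It assumes the Sesame 2.7 packages (org.openrdf.*) that bigdata-core 2.1.4 pulls in according to the comments further down; with RDF4J the packages would be org.eclipse.rdf4j.* instead. The function name dumpViaConstruct and its path parameter are made up for illustration.

```scala
import java.io.FileOutputStream
import org.openrdf.query.QueryLanguage
import org.openrdf.repository.RepositoryConnection
import org.openrdf.rio.{RDFFormat, Rio}

// Hypothetical helper: writes the result of a CONSTRUCT query to `path` as Turtle.
def dumpViaConstruct(cxn: RepositoryConnection, path: String): Unit = {
  val out = new FileOutputStream(path)
  try {
    val writer = Rio.createWriter(RDFFormat.TURTLE, out)
    // evaluate(writer) streams each constructed statement to the RDF writer
    cxn.prepareGraphQuery(QueryLanguage.SPARQL,
      "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }").evaluate(writer)
  } finally {
    out.close() // runs even if the query or the writer throws
  }
}
```

With the connection from the question, this would be called as dumpViaConstruct(cxn, "rdf.ttl").
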
Jeen Broekstra · Answer 3 · 2017-05-13T01:24:13.213

2

Yet another way to achieve this is as follows:

val out = new FileOutputStream("rdf.ttl")
Rio.write(cxn.getStatements(null,null,null), out, RDFFormat.TURTLE)

This works because getStatements returns a RepositoryResult object, which in RDF4J implements Iterable<Statement> and can therefore be passed directly to Rio.write.

You can also do this:

val writer = Rio.createWriter(RDFFormat.TURTLE, out)
cxn.export(writer)

The advantage of using export over getStatements is that it will also write any namespace declarations existing in your repository to the file.

The advantage of either of these approaches over the other answers is that you bypass the SPARQL query parser altogether - so it's more efficient for large repos.
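
On the namespace point: export can only write prefixes that are registered in the repository. The sketch below registers one before exporting, assuming the Sesame 2.7 packages (org.openrdf.*). The helper name, the prefix "ex" and the IRI http://example.org/ are placeholders, and whether Blazegraph's connection stores and reports namespaces this way is an assumption, not something confirmed in this thread.

```scala
import java.io.FileOutputStream
import org.openrdf.repository.RepositoryConnection
import org.openrdf.rio.{RDFFormat, Rio}

// Hypothetical helper: registers a prefix, then exports the repository as Turtle.
def exportWithPrefix(cxn: RepositoryConnection, path: String): Unit = {
  // Placeholder prefix and namespace IRI, for illustration only
  cxn.setNamespace("ex", "http://example.org/")
  val out = new FileOutputStream(path)
  try {
    val writer = Rio.createWriter(RDFFormat.TURTLE, out)
    // export passes namespace declarations and statements to the writer
    cxn.export(writer)
  } finally {
    out.close()
  }
}
```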

edited May 13 '17 at 01:24

answered May 13 '17 at 01:04


Thanks, this was really the intent of the question. In my hands, `cxn.export(writer)` performs the dump, but without prefix (namespace?) definitions. For `cxn.getStatements(null,null,null)`, I get `not enough arguments for method getStatements: (x$1: org.openrdf.model.Resource, x$2: org.openrdf.model.URI, x$3: org.openrdf.model.Value, x$4: Boolean, x$5: org.openrdf.model.Resource*)org.openrdf.repository.RepositoryResult[org.openrdf.model.Statement]. Unspecified value parameters x$4, x$5.` I'm using – Mark Miller May 13 '17 at 13:49
My build.sbt requests `"com.blazegraph" % "bigdata-core" % "2.1.4"`, which is pulling sesame-*-2.7.12.jar, ___not RDF4J 2.2.1.___ – Mark Miller May 13 '17 at 13:59
That sounds like a Blazegraph distribution issue. I'd contact them directly about this. – Jeen Broekstra May 13 '17 at 22:16
@MarkMiller with respect to the second error (not enough arguments), that might be because the `includeInferred` parameter in the `getStatements` method was only made optional in more recent versions of RDF4J. You could try `getStatements(null,null,null,true)` instead. – Jeen Broekstra May 14 '17 at 23:19
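
Following the last comment, a minimal sketch of the getStatements approach for the older API (Sesame 2.7, org.openrdf.*), where includeInferred has to be passed explicitly. It calls asList() on the result to obtain a java.util.List, which satisfies the Iterable[Statement] parameter shown in the question's error message. Note that asList() loads every statement into memory, so this variant suits small repositories better than large ones. The helper name dumpStatements is made up for illustration.

```scala
import java.io.FileOutputStream
import org.openrdf.repository.RepositoryConnection
import org.openrdf.rio.{RDFFormat, Rio}

// Hypothetical helper: writes all statements of the repository to `path` as Turtle.
def dumpStatements(cxn: RepositoryConnection, path: String): Unit = {
  val out = new FileOutputStream(path)
  try {
    // null subject/predicate/object match everything; true = include inferred statements
    val result = cxn.getStatements(null, null, null, true)
    try {
      // asList() materializes the whole result in memory
      Rio.write(result.asList(), out, RDFFormat.TURTLE)
    } finally {
      result.close()
    }
  } finally {
    out.close()
  }
}
```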
