I have a large RDF file (in the gigabytes) that I'd like imported into a remote graph database.

The database exposes a Graph Store Protocol endpoint over the RDF4J API. Of the many ingest routes the database supports, the only one acceptable in my scenario is using this endpoint (posting to /statements).

The database is hosted on a cloud provider, and various infrastructure layers (web server, application container) impose upload limits, so I can't just post the file.


Using dotNetRDF, how can I load a lot of RDF into a remote database over Graph Store in chunks?

Samu Lang

1 Answer

WriteToStoreHandler writes RDF directly to a storage provider, flushing triples in batches of a configurable size.

SesameHttpProtocolConnector is a storage provider that supports the RDF4J API, which includes the Graph Store Protocol.

using System.IO;
using VDS.RDF.Parsing;
using VDS.RDF.Parsing.Handlers;
using VDS.RDF.Storage;

var path = "large-rdf-file-path.nt";
var server = "http://example.com/sparql-protocol-base-url";
var repository = "repository-name";
var batchSize = 10; // triples per HTTP request; tune to stay under your upload limit

using (var connector = new SesameHttpProtocolConnector(server, repository))
{
    // Posts a batch to the store each time batchSize triples have been parsed
    var handler = new WriteToStoreHandler(connector, batchSize);

    using (var reader = File.OpenText(path))
    {
        // Streams the file through the handler; the graph is never fully in memory
        var parser = new NTriplesParser();

        parser.Load(handler, reader);
    }
}
  Bear in mind that if the data contains blank nodes, this approach might change the data: there is no guarantee that the remote store treats blank node IDs the same across separate HTTP requests – RobV Jan 09 '18 at 09:58
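
One way to sidestep the blank-node caveat above is to skolemize the input before uploading: rewrite every `_:label` into an IRI you control, so that batch boundaries cannot split one blank node into several. A minimal sketch for N-Triples input, assuming simple alphanumeric blank-node labels; the `Skolemizer` helper and the `/.well-known/genid/` base are illustrative choices, not part of dotNetRDF:

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class Skolemizer
{
    // Matches N-Triples blank-node labels like _:b0 (alphanumeric labels only)
    static readonly Regex BlankNode = new Regex(@"_:([A-Za-z0-9]+)");

    // Replaces each blank-node label with a skolem IRI under a base you control
    public static string SkolemizeLine(string line, string baseIri) =>
        BlankNode.Replace(line, m => $"<{baseIri}/.well-known/genid/{m.Groups[1].Value}>");

    // Streams the input file line by line, writing the skolemized copy
    public static void SkolemizeFile(string inPath, string outPath, string baseIri)
    {
        using (var reader = File.OpenText(inPath))
        using (var writer = File.CreateText(outPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(SkolemizeLine(line, baseIri));
            }
        }
    }
}
```

Running the upload against the skolemized copy makes the result deterministic regardless of how the store handles blank nodes between requests, at the cost of the output containing IRIs where the source had blank nodes.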