I am using Blazegraph's 1.5.3 version of their Bigdata DB (now rebranded as Blazegraph). I have a service that acts as a Gateway, implementing a bunch of persistence-layer methods. Now I'm writing unit tests for those methods. I'm using the embedded setup with Jetty. My setup code is below:
int port = 0; // random port
String namespace = "kb";
int queryThreadPoolSize = ConfigParams.DEFAULT_QUERY_THREAD_POOL_SIZE;
boolean forceOverflow = false;
String servletContextListenerClass = ConfigParams.DEFAULT_SERVLET_CONTEXT_LISTENER_CLASS;
System.setProperty(SystemProperties.JETTY_XML, "jetty.xml");
String propertyFile = "RWStore.properties";
System.setProperty(SystemProperties.BIGDATA_PROPERTY_FILE, propertyFile);
final Map<String, String> initParams = new LinkedHashMap<>();
initParams.put("propertyFile", propertyFile);
initParams.put("namespace", namespace);
initParams.put("queryThreadPoolSize", Integer.toString(queryThreadPoolSize));
initParams.put("forceOverflow", Boolean.toString(forceOverflow));
initParams.put("servletContextListenerClass", servletContextListenerClass);
sparqlServer = NanoSparqlServer.newInstance(port, journal, initParams);
LOGGER.info("Waiting for NanoSparqlServer to start...");
NanoSparqlServer.awaitServerStart(sparqlServer);
serverUrl = sparqlServer.getURI().toString();
LOGGER.info("NanoSparqlServer started on: " + serverUrl + '\n');
I am using the default jetty.xml
configuration from the com.bigdata 1.5.3 jar:
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
<!-- See http://www.eclipse.org/jetty/documentation/current/ -->
<!-- See http://wiki.eclipse.org/Jetty/Reference/jetty.xml_syntax -->
<Configure id="Server" class="org.eclipse.jetty.server.Server">
<!-- =========================================================== -->
<!-- Configure the Server Thread Pool. -->
<!-- The server holds a common thread pool which is used by -->
<!-- default as the executor used by all connectors and servlet -->
<!-- dispatches. -->
<!-- -->
<!-- Configuring a fixed thread pool is vital to controlling the -->
<!-- maximal memory footprint of the server and is a key tuning -->
<!-- parameter for tuning. In an application that rarely blocks -->
<!-- then maximal threads may be close to the number of 5*CPUs. -->
<!-- In an application that frequently blocks, then maximal -->
<!-- threads should be set as high as possible given the memory -->
<!-- available. -->
<!-- -->
<!-- Consult the javadoc of o.e.j.util.thread.QueuedThreadPool -->
<!-- for all configuration that may be set here. -->
<!-- =========================================================== -->
<Arg name="threadpool"><New id="threadpool" class="org.eclipse.jetty.util.thread.QueuedThreadPool"/></Arg>
<Get name="ThreadPool">
<Set name="minThreads" type="int"><Property name="jetty.threads.min" default="10"/></Set>
<Set name="maxThreads" type="int"><Property name="jetty.threads.max" default="64"/></Set>
<Set name="idleTimeout" type="int"><Property name="jetty.threads.timeout" default="60000"/></Set>
<Set name="detailedDump">false</Set>
</Get>
<!-- =========================================================== -->
<!-- Get the platform mbean server -->
<!-- =========================================================== -->
<Call id="MBeanServer" class="java.lang.management.ManagementFactory"
name="getPlatformMBeanServer" />
<!-- =========================================================== -->
<!-- Initialize the Jetty MBean container -->
<!-- =========================================================== -->
<!-- Note: This breaks CI if it is enabled
<Call name="addBean">
<Arg>
<New id="MBeanContainer" class="org.eclipse.jetty.jmx.MBeanContainer">
<Arg>
<Ref refid="MBeanServer" />
</Arg>
</New>
</Arg>
</Call>-->
<!-- Add the static log to the MBean server.
<Call name="addBean">
<Arg>
<New class="org.eclipse.jetty.util.log.Log" />
</Arg>
</Call>-->
<!-- For remote MBean access (optional)
<New id="ConnectorServer" class="org.eclipse.jetty.jmx.ConnectorServer">
<Arg>
<New class="javax.management.remote.JMXServiceURL">
<Arg type="java.lang.String">rmi</Arg>
<Arg type="java.lang.String" />
<Arg type="java.lang.Integer"><SystemProperty name="jetty.jmxrmiport" default="1090"/></Arg>
<Arg type="java.lang.String">/jndi/rmi://<SystemProperty name="jetty.jmxrmihost" default="localhost"/>:<SystemProperty name="jetty.jmxrmiport" default="1099"/>/jmxrmi</Arg>
</New>
</Arg>
<Arg>org.eclipse.jetty.jmx:name=rmiconnectorserver</Arg>
<Call name="start" />
</New>-->
<!-- =========================================================== -->
<!-- Http Configuration. -->
<!-- This is a common configuration instance used by all -->
<!-- connectors that can carry HTTP semantics (HTTP, HTTPS, SPDY)-->
<!-- It configures the non wire protocol aspects of the HTTP -->
<!-- semantic. -->
<!-- -->
<!-- Consult the javadoc of o.e.j.server.HttpConfiguration -->
<!-- for all configuration that may be set here. -->
<!-- =========================================================== -->
<New id="httpConfig" class="org.eclipse.jetty.server.HttpConfiguration">
<Set name="secureScheme">https</Set>
<Set name="securePort"><Property name="jetty.secure.port" default="8443" /></Set>
<Set name="outputBufferSize"><Property name="jetty.output.buffer.size" default="32768" /></Set>
<Set name="requestHeaderSize"><Property name="jetty.request.header.size" default="8192" /></Set>
<Set name="responseHeaderSize"><Property name="jetty.response.header.size" default="8192" /></Set>
<Set name="sendServerVersion"><Property name="jetty.send.server.version" default="true" /></Set>
<Set name="sendDateHeader"><Property name="jetty.send.date.header" default="false" /></Set>
<Set name="headerCacheSize">512</Set>
<!-- Uncomment to enable handling of X-Forwarded- style headers
<Call name="addCustomizer">
<Arg><New class="org.eclipse.jetty.server.ForwardedRequestCustomizer"/></Arg>
</Call>
-->
</New>
<!-- Configure the HTTP endpoint. -->
<Call name="addConnector">
<Arg>
<New class="org.eclipse.jetty.server.ServerConnector">
<Arg name="server"><Ref refid="Server" /></Arg>
<Arg name="factories">
<Array type="org.eclipse.jetty.server.ConnectionFactory">
<Item>
<New class="org.eclipse.jetty.server.HttpConnectionFactory">
<Arg name="config"><Ref refid="httpConfig" /></Arg>
</New>
</Item>
</Array>
</Arg>
<Set name="host"><SystemProperty name="jetty.host" /></Set>
<Set name="port"><SystemProperty name="jetty.port" default="9999" /></Set>
<Set name="idleTimeout"><SystemProperty name="http.timeout" default="30000"/></Set>
</New>
</Arg>
</Call>
<!-- =========================================================== -->
<!-- Set handler Collection Structure -->
<!-- =========================================================== -->
<Set name="handler">
<New id="Handlers" class="org.eclipse.jetty.server.handler.HandlerCollection">
<Set name="handlers">
<Array type="org.eclipse.jetty.server.Handler">
<Item>
<New id="Contexts" class="org.eclipse.jetty.server.handler.ContextHandlerCollection">
<Call name="addHandler">
<Arg>
<!-- This is the redirect from root to /bigdata -->
<New id="moved" class="org.eclipse.jetty.server.handler.MovedContextHandler">
<Set name="contextPath">/</Set>
<Set name="newContextURL">/bigdata</Set>
<Set name="permanent">true</Set>
<Set name="discardPathInfo">false</Set>
<Set name="discardQuery">false</Set>
</New>
</Arg>
</Call>
<Call name="addHandler">
<Arg>
<!-- This is the bigdata web application. -->
<New id="WebAppContext" class="org.eclipse.jetty.webapp.WebAppContext">
<Set name="war"><SystemProperty name="jetty.resourceBase" default="bigdata-war/src"/></Set>
<Set name="contextPath">/bigdata</Set>
<Set name="descriptor">WEB-INF/web.xml</Set>
<Set name="parentLoaderPriority">true</Set>
<Set name="extractWAR">false</Set>
<Set name="overrideDescriptor"><SystemProperty name="jetty.overrideWebXml" default="bigdata-war/src/WEB-INF/override-web.xml"/></Set>
<Set name="maxFormContentSize">10485760</Set>
</New>
</Arg>
</Call>
</New>
</Item>
</Array>
</Set>
</New>
</Set>
<!-- =========================================================== -->
<!-- extra server options -->
<!-- =========================================================== -->
<Set name="stopAtShutdown">true</Set>
<Set name="stopTimeout">5000</Set>
<Set name="dumpAfterStart"><Property name="jetty.dump.start" default="false"/></Set>
<Set name="dumpBeforeStop"><Property name="jetty.dump.stop" default="false"/></Set>
</Configure>
... and I am using the default RWStore.properties from the same jar:
#
# Note: These options are applied when the journal and the triple store are
# first created.
##
## Journal options.
##
# The backing file. This contains all your data. You want to put this someplace
# safe. The default locator will wind up in the directory from which you start
# your servlet container.
com.bigdata.journal.AbstractJournal.createTempFile=true
# The persistence engine. Use 'Disk' for the WORM or 'DiskRW' for the RWStore.
com.bigdata.journal.AbstractJournal.bufferMode=DiskRW
# Setup for the RWStore recycler rather than session protection.
com.bigdata.service.AbstractTransactionService.minReleaseAge=1
# Enable group commit. See http://wiki.blazegraph.com/wiki/index.php/GroupCommit and BLZG-192.
#com.bigdata.journal.Journal.groupCommit=false
com.bigdata.btree.writeRetentionQueue.capacity=4000
com.bigdata.btree.BTree.branchingFactor=128
# 200M initial extent.
com.bigdata.journal.AbstractJournal.initialExtent=209715200
com.bigdata.journal.AbstractJournal.maximumExtent=209715200
##
## Setup for QUADS mode without the full text index.
##
com.bigdata.rdf.sail.truthMaintenance=false
com.bigdata.rdf.store.AbstractTripleStore.quads=true
com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=false
com.bigdata.rdf.store.AbstractTripleStore.textIndex=false
com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms
# Bump up the branching factor for the lexicon indices on the default kb.
com.bigdata.namespace.kb.lex.com.bigdata.btree.BTree.branchingFactor=400
# Bump up the branching factor for the statement indices on the default kb.
com.bigdata.namespace.kb.spo.com.bigdata.btree.BTree.branchingFactor=1024
# Uncomment to enable collection of OS level performance counters. When
# collected they will be self-reported through the /counters servlet and
# the workbench "Performance" tab.
#
# com.bigdata.journal.Journal.collectPlatformStatistics=true
Using these configurations, the server starts up fine, and I can access the Web console, make queries, and interact via the BigdataGraphClient
API in Java. Now I'm just trying to figure out how to clear out the graph to avoid data leakage between unit tests. I've tried the following:
Use the
BigdataGraphClient
Java API to remove all edges and vertices. Leaves some of these edges and vertices in place, for reasons unknown to me.graph.getEdges.forEach(Edge::remove) graph.getVertices.forEach(Vertex::remove)
Stop and destroy server. Leaves the journal file in place.
sparqlServer.stop(); sparqlServer.destroy();
Use a temporary journal file by setting
com.bigdata.journal.AbstractJournal.createTempFile=true
and commenting outcom.bigdata.journal.AbstractJournal.file=bigdata.jnl
. This clears the journal file, but it throws aDatasetNotFoundException
after the first test.Put the journal file in a temporary directory in
/tmp/bigdata-test/bigdata.jnl
and delete/recreate that directory between tests. This has the same problem as #2.Tried to create my own
Journal
object and pass that in as theIndexManager
parameter of theNanoSparqlServer.newInstance
method. This fails due to a known issue with old Lucene dependencies. I cannot include these in my project because I am relying on the newer version of Lucene that conflicts with this one. The error thrown is the same as documented in the referenced Jira ticket.
Anybody know of a clean, reliable way to clear the graph between tests (in a tearDown
method run after every test)?