I have the following code that creates a graph database using Titan:

public class TitanMassiveInsertion {

private TitanGraph titanGraph = null;
private BatchGraph<TitanGraph> batchGraph = null;

public static void main(String args[]) {
    TitanMassiveInsertion test = new TitanMassiveInsertion();
    test.startup("data/titanDB");
    test.createGraph("data/flickrEdges.txt");
    test.shutdown();
}
public void startup(String titanDBDir) {
    System.out.println("The Titan database is now starting . . . .");
    BaseConfiguration config = new BaseConfiguration();
    Configuration storage = config.subset(GraphDatabaseConfiguration.STORAGE_NAMESPACE);
    storage.setProperty(GraphDatabaseConfiguration.STORAGE_BACKEND_KEY, "local");
    storage.setProperty(GraphDatabaseConfiguration.STORAGE_DIRECTORY_KEY, titanDBDir);
    storage.setProperty(GraphDatabaseConfiguration.STORAGE_BATCH_KEY, true);
    Configuration index = storage.subset(GraphDatabaseConfiguration.INDEX_NAMESPACE).subset("nodes");
    index.setProperty(GraphDatabaseConfiguration.INDEX_BACKEND_KEY, "elasticsearch");
    index.setProperty("local-mode", true);
    index.setProperty("client-only", false);
    index.setProperty(GraphDatabaseConfiguration.STORAGE_DIRECTORY_KEY, titanDBDir + File.separator + "es");

    titanGraph = TitanFactory.open(config);
    titanGraph.makeKey("nodeId").dataType(String.class).indexed(Vertex.class).make();
    titanGraph.makeLabel("similar").oneToOne().make();
    titanGraph.commit();
    batchGraph = new BatchGraph<TitanGraph>(titanGraph, VertexIDType.STRING, 1000);
    batchGraph.setVertexIdKey("nodeId");
    batchGraph.setLoadingFromScratch(true);

}

public void shutdown() {
    System.out.println("The Titan database is now shutting down . . . .");
    if(titanGraph != null) {
        batchGraph.shutdown();
        titanGraph.shutdown();
        batchGraph = null;
        titanGraph = null;
    }
}

public void createGraph(String datasetDir) {
    System.out.println("Creating the Titan database . . . .");      
    try {
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(datasetDir)));
        String line;
        int lineCounter = 1;
        int nodeCounter = 0;
        Vertex srcVertex, dstVertex;
        while((line = reader.readLine()) != null) {
            if(lineCounter > 4) {
                String[] parts = line.split("\t");

                srcVertex = batchGraph.getVertex(parts[0]);
                if(srcVertex == null) {
                    srcVertex = batchGraph.addVertex(parts[0]);
                    srcVertex.setProperty("nodeId", parts[0]);
                    nodeCounter++;
                }
                dstVertex = batchGraph.getVertex(parts[1]);
                if(dstVertex == null) {
                    dstVertex = batchGraph.addVertex(parts[1]);
                    dstVertex.setProperty("nodeId", parts[1]);
                    nodeCounter++;
                }
                Edge edge = batchGraph.addEdge(null, srcVertex, dstVertex, "similar");

                //System.out.println(edge);
                System.out.println(nodeCounter);
            }
            lineCounter++;
        }
        reader.close();
    }
    catch(IOException ioe) {
        ioe.printStackTrace();
    }
}
}

After creating the graph database, I want to check that everything is OK, so I am trying to print the vertices with this test code:

TitanGraph graph = TitanFactory.open("data/titanDB");
for(Vertex v : graph.getVertices()) {
    System.out.println(v.getProperty("nodeId"));
}
graph.shutdown();

But it prints nothing and seems to get stuck in an infinite loop. I also tried Vertex v : graph.query().vertices(), as the tutorial suggests, but the problem persists. On the other hand, when I call System.out.println(graph.getVertices("nodeId", "2604051056")); I get the proper output. Is there a problem with the getVertices() function?

salvador

1 Answer

You are using the "local" storage backend, which means BerkeleyJE by default. BerkeleyJE has transactions enabled by default, so it has to keep track of every data element it touches in order to provide the correct isolation. That bookkeeping becomes expensive when a vertex scan touches essentially all of the data.

Hence, try disabling transactions:

storage.transactions = false
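
As a minimal sketch of where that setting could live, the snippet below writes a properties file containing the storage keys from the question plus storage.transactions=false. The class name TitanScanConfig and the file name titan-scan.properties are hypothetical, and it assumes your Titan version lets you pass a properties file path to TitanFactory.open(...) instead of a directory; check that against the version you run.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Properties;

// Hypothetical helper: only uses java.util.Properties, no Titan classes.
public class TitanScanConfig {

    // Collects the storage settings from the question and adds the
    // transactions flag suggested in this answer.
    static Properties build(String titanDBDir) {
        Properties props = new Properties();
        props.setProperty("storage.backend", "local");
        props.setProperty("storage.directory", titanDBDir);
        props.setProperty("storage.transactions", "false");
        return props;
    }

    // Writes the settings to a properties file at the given path.
    static void write(Properties props, String path) throws IOException {
        try (OutputStream out = new FileOutputStream(path)) {
            props.store(out, "Titan settings for a full vertex scan");
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the resulting file is then opened with
        // TitanFactory.open("titan-scan.properties") in the test code.
        write(build("data/titanDB"), "titan-scan.properties");
    }
}
```

If you build the configuration in code as in the question, the equivalent would be to set the same key on the storage subset before calling TitanFactory.open(config).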
  • I thought that transactions are automatically disabled when I enable batch loading. Isn't that right? – salvador Dec 05 '13 at 22:37