I have a program that opens an embedded database and runs several queries on it. I create one ExecutionEngine and reuse it for every query. Running just the first three queries, which are the simplest, takes... well, I don't know exactly how long, because I stopped the program after about half an hour, by which point it had finished only two of them. I have had issues with Cypher being slow on this graph before, but it has never been this bad.
I am using the API for some more complicated queries, but I'd rather use Cypher for these because they are so simple. I also have some other queries that basically need to run through and return most of the database, some nodes multiple times. I know this is not recommended, but I need everything laid out according to its relationships; simply getting every node in the graph would be entirely useless. At the rate I'm going, that query would take a few days.
I have no problem with what other people consider "slow" (e.g. 500 ms), because this is not a real-time application, but 20 minutes is excessive. What's going wrong? What am I doing wrong?
My database contains several million nodes and at least as many relationships. Neo4j is supposed to be able to handle graphs that large easily. Why am I getting such crazily long execution times?
If anyone can help me with this (maybe my queries are all wrong?), I'd really appreciate it!
Thanks,
bsg
Here is the code for the first three queries, which together take more than 30 minutes. It runs each query and prints the result (a simple count) to a file.
ExecutionEngine eng = new ExecutionEngine(graphdb);
String filepath = resultstring + "basicstats.txt";
PrintWriter basics = new PrintWriter(filepath);
String querystring = "START user=node:userIndex(\"Username:*\")" +
" WHERE has(user.FullNodeCreationTime) "
+ " RETURN COUNT(user) AS numcrawled";
ExecutionResult result = eng.execute(querystring);
basics.print("Number of users crawled: ");
basics.println(result.iterator().next().get("numcrawled"));
String otherusers = "START user=node:userIndex(\"Username:*\")" +
" WHERE NOT has(user.FullNodeCreationTime)" +
" RETURN COUNT(user) AS numtouched";
result = eng.execute(otherusers);
basics.print("Number of users touched (not crawled): ");
basics.println(result.iterator().next().get("numtouched"));
String partialinfousers = "START user=node:userIndex(\"Username:*\")" +
" WHERE NOT has(user.FullNodeCreationTime) AND NOT has(user.NumFollowers)" +
" RETURN COUNT(user.Username) AS numcrawled";
result = eng.execute(partialinfousers);
basics.print("Number of users with partial info: ");
basics.println(result.iterator().next().get("numcrawled"));
basics.close();
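Since I don't have exact timings for each query, here is a minimal sketch of how I could measure them individually. The class name `QueryTimer` and the method `timed` are hypothetical helpers, not part of Neo4j; in the real program the supplier would wrap a call such as `eng.execute(querystring)` from the code above. The stand-in task here just sums some numbers so that the example runs without a database.

```java
import java.util.function.Supplier;

public class QueryTimer {
    // Hypothetical helper: runs the task once, prints the elapsed
    // wall-clock time under the given label, and returns the task's result.
    public static <T> T timed(String label, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
            System.out.println(label + " took " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) {
        // Stand-in for a query; in the real program this lambda would be
        // something like () -> eng.execute(querystring) (assumption).
        Long sum = timed("stand-in query", () -> {
            long total = 0;
            for (long i = 1; i <= 1000; i++) {
                total += i;
            }
            return total;
        });
        System.out.println("result: " + sum);
    }
}
```

Wrapping each of the three queries this way would at least show whether one query dominates the run time or whether all of them are equally slow.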