This is a follow-up to "Can't reproduce/verify the performance claims in graph databases and neo4j in action books". I have updated the setup and tests, and I don't want to change the original question too much.
The whole story (including scripts etc) is on https://baach.de/Members/jhb/neo4j-performance-compared-to-mysql
Short version: while trying to verify the performance claims made in the 'Graph Databases' book, I got the following results when querying a random dataset of n people with 50 friends each (times in seconds):
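For context, here is a minimal sketch of how a random friendship dataset of this shape could be generated. It is not the actual generation script (that is on the page linked above); the CSV file names are just placeholders for this illustration, and noscenda_name is the property used in the query further down.

    import csv
    import random

    N_PEOPLE = 100000            # also run with 1000000
    FRIENDS_PER_PERSON = 50

    # People: one row per person, identified by the indexed noscenda_name property
    with open("people.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["noscenda_name"])
        for i in range(N_PEOPLE):
            writer.writerow(["person%d" % i])

    # Friendships: 50 random outgoing :friend edges per person
    with open("friendships.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["from", "to"])
        for i in range(N_PEOPLE):
            for j in random.sample(range(N_PEOPLE), FRIENDS_PER_PERSON):
                writer.writerow(["person%d" % i, "person%d" % j])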
My results for 100k people:

depth    neo4j       mysql     python
1        0.010       0.000     0.000
2        0.018       0.001     0.000
3        0.538       0.072     0.009
4        22.544      3.600     0.330
5        1269.942    180.143   0.758
"*": single run only
My results for 1 million people:

depth    neo4j       mysql     python
1        0.010       0.000     0.000
2        0.018       0.002     0.000
3        0.689       0.082     0.012
4        30.057      5.598     1.079
5        1441.397*   300.000   9.791
"*": single run only
Using Neo4j 1.9.2 on 64-bit Ubuntu, I have set up neo4j.properties with these values:
neostore.nodestore.db.mapped_memory=250M
neostore.relationshipstore.db.mapped_memory=2048M
and neo4j-wrapper.conf with:
wrapper.java.initmemory=1024
wrapper.java.maxmemory=8192
My query to Neo4j looks like this (using the REST API):
start person=node:node_auto_index(noscenda_name="person123") match (person)-[:friend]->()-[:friend]->(friend) return count(distinct friend);
The node_auto_index is in place, obviously.
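For reference, a minimal sketch of how such a query can be sent to the legacy /db/data/cypher REST endpoint. This is not my exact benchmark script (see the link above); the requests library, the default localhost:7474 URL and the build_query helper are only assumptions for this illustration. Deeper depths simply repeat the -[:friend]->() hop in the match pattern.

    import requests

    CYPHER_URL = "http://localhost:7474/db/data/cypher"   # default Neo4j 1.9 REST endpoint

    def build_query(depth):
        # depth 2 yields the query shown above, with a {name} parameter
        hops = "-[:friend]->()" * (depth - 1) + "-[:friend]->(friend)"
        return ("start person=node:node_auto_index(noscenda_name={name}) "
                "match (person)" + hops + " return count(distinct friend)")

    response = requests.post(
        CYPHER_URL,
        json={"query": build_query(2), "params": {"name": "person123"}},
        headers={"Accept": "application/json"},
    )
    response.raise_for_status()
    print(response.json()["data"][0][0])   # distinct friend count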
Is there anything I can do to speed Neo4j up (to be faster than MySQL)?
There is also another benchmark on Stack Overflow with the same problem.