
I have been running some tests on Neo4j, calculating the shortest path between 2 nodes.

  1. With 100k nodes and 10 million edges (100 edges per node), the shortest path algorithm ran in 0.4-3s.
  2. With 200k nodes and 40 million edges (200 edges per node), it takes 40s or more.
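
For context, the operation being timed here is a shortest-path search between two nodes of an unweighted graph. Below is a minimal Python sketch of one common approach, bidirectional breadth-first search, run on a small random graph. It is an illustration only and not Neo4j's implementation; the function names `random_graph` and `shortest_path_length`, the seed, and the graph sizes are assumptions made for the example.

```python
import random
from collections import deque  # not required by the search itself; kept out of the hot path


def random_graph(num_nodes, out_degree, seed=0):
    """Build an undirected adjacency list where each node picks
    `out_degree` random neighbours (hypothetical test-data generator)."""
    rng = random.Random(seed)
    adj = [set() for _ in range(num_nodes)]
    for u in range(num_nodes):
        for v in rng.sample(range(num_nodes), out_degree):
            if v != u:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def shortest_path_length(adj, source, target):
    """Bidirectional BFS. Returns the hop count of a shortest path,
    or None if target is unreachable from source."""
    if source == target:
        return 0
    dist_s, dist_t = {source: 0}, {target: 0}
    frontier_s, frontier_t = [source], [target]
    while frontier_s and frontier_t:
        # Always expand the smaller frontier by one full level.
        expand_source = len(frontier_s) <= len(frontier_t)
        if expand_source:
            frontier, dist, other = frontier_s, dist_s, dist_t
        else:
            frontier, dist, other = frontier_t, dist_t, dist_s
        next_frontier = []
        best = None
        for u in frontier:
            for v in adj[u]:
                if v in other:
                    # The two searches meet: record the candidate length.
                    cand = dist[u] + 1 + other[v]
                    best = cand if best is None else min(best, cand)
                elif v not in dist:
                    dist[v] = dist[u] + 1
                    next_frontier.append(v)
        if best is not None:
            return best
        if expand_source:
            frontier_s = next_frontier
        else:
            frontier_t = next_frontier
    return None


if __name__ == "__main__":
    # Sizes are deliberately far smaller than the tests described above.
    graph = random_graph(num_nodes=1000, out_degree=10)
    print(shortest_path_length(graph, 0, 999))
```

Searching from both ends keeps each frontier much smaller than a one-sided search would in a dense graph, which matters when every node has 100 to 200 edges.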

My computer obviously isn't intended for Big Data analysis, but I don't even know whether buying a server with 128GB of RAM and many more processors would let the second test run in a reasonable time. (Do you think it could?)

Certainly with 1 million nodes or more, Neo4j will not help me anymore. I have spent many hours looking online for a way to use Giraph like Neo4j: some sort of API (even in Java) through which I can run a query and get a result back. But I found nothing.

Thanks in advance

peter
M4rk
  • A few questions, so that we get a better understanding of the context. (1) Are you doing shortest path between: (a) every pair of nodes, or (b) 2 specific nodes? (2) If the latter, have you indexed your data so that the 2 specific nodes can be found quickly (or, are you using something like `START n=node(123)` to identify the specific nodes)? – cybersam May 09 '14 at 19:35
  • Shortest path between 2 nodes. I use indexes :) – M4rk May 10 '14 at 10:45
  • Can you add more information to your question? Like your graph model, the code you run to compute the shortest path etc.? – Michael Hunger May 11 '14 at 21:55
  • Do you limit your shortest path to something like max-hops of 4 or 10 ? – Michael Hunger May 11 '14 at 21:55
  • Which version of Neo4j do you use? – Michael Hunger May 11 '14 at 21:56
  • I use the Cypher "shortestPath" algorithm, with the latest version of Neo4j. I do not want to set max-hops to 3 or 4 because I always want a shortest path. For this reason I thought Giraph would be a valid solution – M4rk May 12 '14 at 17:42

0 Answers