I am trying to decide which platform to use for a project I am working on and was wondering if anybody had some input. I have a very large data set (about 5 million rows) and want to run an algorithm on it. I already developed a Java program that works on small data using sparse matrices, but I am having trouble scaling it up. I will also want to visualize the results later, since the algorithm uses bipartite graph methods to cluster my data. I am looking into Neo4j as a platform but am unsure whether I would be able to do the computations in it. So my question really is: how complex an algorithm can Neo4j handle? Any suggestions are welcome!
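For reference, here is a minimal, toy-sized sketch of the kind of computation I mean (the matrix contents, dimensions, and tolerance below are placeholders, not my real data): a sparse matrix stored as nested maps, multiplied against a vector in a loop until the result converges.

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    public class SparseIterationSketch {

        // Sparse matrix: row -> (col -> value); only nonzero entries are stored.
        static Map<Integer, Map<Integer, Double>> matrix =
                new HashMap<Integer, Map<Integer, Double>>();

        static void put(int row, int col, double value) {
            Map<Integer, Double> r = matrix.get(row);
            if (r == null) {
                r = new HashMap<Integer, Double>();
                matrix.put(row, r);
            }
            r.put(col, value);
        }

        // One sparse matrix-vector product: y = A * x.
        static double[] multiply(double[] x) {
            double[] y = new double[x.length];
            for (Map.Entry<Integer, Map<Integer, Double>> row : matrix.entrySet()) {
                double sum = 0.0;
                for (Map.Entry<Integer, Double> e : row.getValue().entrySet()) {
                    sum += e.getValue() * x[e.getKey()];
                }
                y[row.getKey()] = sum;
            }
            return y;
        }

        public static void main(String[] args) {
            int n = 4;                          // toy size; the real data is millions of rows
            put(0, 1, 0.5); put(1, 0, 0.5);     // placeholder entries
            put(2, 3, 1.0); put(3, 2, 1.0);

            double[] x = new double[n];
            Arrays.fill(x, 1.0 / n);

            // Multiply repeatedly until successive vectors differ by
            // less than a tolerance (placeholder value).
            for (int iter = 0; iter < 1000; iter++) {
                double[] next = multiply(x);
                double diff = 0.0;
                for (int i = 0; i < n; i++) {
                    diff += Math.abs(next[i] - x[i]);
                }
                x = next;
                if (diff < 1e-9) {
                    break;
                }
            }
            System.out.println(Arrays.toString(x));
        }
    }

This works fine at small sizes, but it is exactly what I am having trouble scaling to the full data.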
- Your question is a bit vague; what are you looking for in an answer? 5 million doesn't seem that big: it should all fit in memory on a reasonable server using Neo4j, for example. Neo4j has a bunch of built-in algorithms, which also serve as examples for custom ones (see the invocation sketch after these comments): https://github.com/neo4j/neo4j/tree/2.0/community/graph-algo/src/main/java/org/neo4j/graphalgo/impl – Eve Freeman Jul 08 '13 at 17:37
- 5 million would be the smallest instance. The algorithm I want to run involves putting the data into sparse matrices and multiplying/looping over them until it converges. Do you think I could recreate that in Neo4j, or would it be better to look at something like GraphChi instead? – kelly short Jul 08 '13 at 18:50
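For anyone weighing the options: below is a minimal sketch of invoking one of Neo4j's built-in algorithms from the embedded Java API, using class names from the 2.0 graph-algo module linked above. The store path, node ids, relationship type, and max depth are placeholders, and it shows a bundled algorithm (shortest path), not the custom sparse-matrix iteration, which would have to be written against the same API.

    import org.neo4j.graphalgo.GraphAlgoFactory;
    import org.neo4j.graphalgo.PathFinder;
    import org.neo4j.graphdb.Direction;
    import org.neo4j.graphdb.DynamicRelationshipType;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Path;
    import org.neo4j.graphdb.PathExpanders;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.graphdb.factory.GraphDatabaseFactory;

    public class Neo4jAlgoSketch {
        public static void main(String[] args) {
            // "data/graph.db" is a placeholder store path.
            GraphDatabaseService db =
                    new GraphDatabaseFactory().newEmbeddedDatabase("data/graph.db");
            try (Transaction tx = db.beginTx()) {
                Node source = db.getNodeById(0);   // placeholder node ids
                Node target = db.getNodeById(1);

                // One of the bundled algorithms: shortest path over a
                // placeholder "LINKS" relationship type, max depth 10.
                PathFinder<Path> finder = GraphAlgoFactory.shortestPath(
                        PathExpanders.forTypeAndDirection(
                                DynamicRelationshipType.withName("LINKS"),
                                Direction.BOTH),
                        10);
                Path path = finder.findSinglePath(source, target);
                System.out.println(path);
                tx.success();
            } finally {
                db.shutdown();
            }
        }
    }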