I am trying to think of possible ways to solve a problem where there is a huge graph that can't fit on one machine. How would one run an algorithm like BFS or DFS on such a graph?
- It would fetch the data from disk when needed. – Henry Jul 15 '18 at 08:22
- Do you mean it won't fit into memory, or that it won't even fit into storage? – arekolek Jul 15 '18 at 08:23
- Cache as many nodes as practical on the local machine. Fetch nodes from remote storage when needed. Dump the [least-recently-used](https://en.wikipedia.org/wiki/Cache_replacement_policies#LRU) node from cache to make room for the new node. Or plan B: buy more memory. – user3386109 Jul 15 '18 at 09:17
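A minimal sketch of that caching idea (which also covers Henry's fetch-on-demand point), assuming a hypothetical `fetch_remote(node_id)` that loads one node's adjacency list from remote storage:

```python
from collections import OrderedDict, deque

class NodeCache:
    """LRU cache mapping node id -> adjacency list, backed by remote storage."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch              # hypothetical: node_id -> list of neighbor ids
        self.cache = OrderedDict()      # kept in least-recently-used order

    def neighbors(self, node_id):
        if node_id in self.cache:
            self.cache.move_to_end(node_id)     # mark as most recently used
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict the least-recently-used node
            self.cache[node_id] = self.fetch(node_id)
        return self.cache[node_id]

def bfs(start, cache):
    """Plain BFS that only touches the graph through the cache."""
    seen = {start}                      # caveat: at petabyte scale this visited set
    frontier = deque([start])           # would itself have to spill to disk
    while frontier:
        node = frontier.popleft()
        for nxt in cache.neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen
```

`fetch_remote` here stands in for whatever storage layer you actually have (a key-value store, sharded files, etc.); the BFS itself is unchanged, only the adjacency lookups go through the cache.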
- @arekolek: Yes, assuming my graph is petabytes in size, it might not fit into a single machine's storage. – rgaut Jul 15 '18 at 20:14
- @user3386109 Sounds interesting, but what about the scenario where one fan-out of the BFS explores so much data that it isn't available on the same host's storage at all? I did look into a few theoretical ideas, like developing an I/O model or cache-oblivious data structures, etc. – rgaut Jul 15 '18 at 20:19
- Perhaps you'd be interested in distributed algorithms. A distributed version of an algorithm can be used to find a solution while using multiple computers. Distributed algorithms basically work so that every node in the graph runs its own copy of the algorithm, and the solution emerges from the nodes communicating with each other. – Mederr Jul 16 '18 at 11:44
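A single-process sketch of that vertex-centric style, loosely following the bulk-synchronous ("Pregel") model used by systems such as Apache Giraph. This is only an illustration: in a real deployment the vertices would be partitioned across machines and the outbox messages would travel over the network.

```python
from collections import defaultdict

def distributed_bfs(adjacency, source):
    """Vertex-centric BFS: each vertex reacts to incoming messages,
    updates its distance, and messages its neighbors, one superstep at a time."""
    dist = {v: None for v in adjacency}
    inbox = defaultdict(list)
    inbox[source].append(0)
    while inbox:
        outbox = defaultdict(list)
        for v, msgs in inbox.items():       # in reality: in parallel, per machine
            d = min(msgs)
            if dist[v] is None or d < dist[v]:
                dist[v] = d                 # improved distance: propagate it
                for u in adjacency[v]:
                    outbox[u].append(d + 1) # "network" message to a neighbor
        inbox = outbox                      # barrier: next superstep begins
    return dist

g = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(distributed_bfs(g, 0))                # {0: 0, 1: 1, 2: 1, 3: 2}
```

Vertices whose distance does not improve send nothing, so the message traffic dies out and the loop terminates once all shortest distances have stabilized.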