
What algorithms exist for processing a large graph on an ordinary consumer computer (say, 8-16 GB of RAM)?

The task is to process a fairly large graph (to compute its PageRank) that does not fit entirely into RAM under these constraints.

I would like to know what algorithms exist for this, or at least in which direction to start studying. As I understand it, graph-partitioning algorithms could help here, but it is not clear how, given that the entire graph cannot be built in the program at once.

Perhaps there are algorithms that compute PageRank for each part of the graph separately and then combine the results.

UPD: To be more concrete. The task is to compute PageRank on a large graph. The computation is done in a Python program: the graph is built from the data with networkx, and the PageRank calculation is to be performed with the same networkx. The problem is the RAM limit: the entire graph does not fit into memory. So I wonder whether there are algorithms that would let me compute PageRank on graphs (subgraphs?) smaller than the original one.
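For reference, the in-memory approach described above looks roughly like this (a minimal sketch; "edges.txt" is a hypothetical whitespace-separated edge list), and it is exactly the graph-building step that exhausts RAM on a large graph:

```python
import networkx as nx

# Build the whole graph in memory, then run PageRank on it.
# This is the step that runs out of RAM on a large graph.
G = nx.read_edgelist("edges.txt", create_using=nx.DiGraph)
ranks = nx.pagerank(G, alpha=0.85)
```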


1 Answer


Generally speaking, if a graph is too large to fit into memory, it has to be split into several partitions that are processed one at a time.

Suppose the vertices fit into memory while the edges reside on disk. The program then repeatedly loads one partition of the edges into memory, updates the PageRank values, and loads the next partition. X-Stream gives a good solution for this case: http://sigops.org/s/conferences/sosp/2013/papers/p472-roy.pdf
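A minimal sketch of this edge-streaming scheme in Python/NumPy (not X-Stream itself, just the same idea: two O(n) vertex vectors stay in RAM while edges are read from disk in chunks; the binary file of consecutive uint32 (src, dst) pairs and the function names are assumptions, and dangling-node handling is omitted):

```python
import numpy as np

def stream_edge_chunks(path, chunk=1_000_000):
    """Yield (src, dst) index arrays, at most `chunk` edges at a time.
    The file is assumed to hold consecutive (src, dst) uint32 pairs."""
    with open(path, "rb") as f:
        while True:
            buf = np.fromfile(f, dtype=np.uint32, count=2 * chunk)
            if buf.size == 0:
                break
            edges = buf.reshape(-1, 2)
            yield edges[:, 0], edges[:, 1]

def pagerank_streaming(path, n, alpha=0.85, iters=20):
    """Vertex state (two O(n) vectors) stays in RAM; edges are streamed."""
    deg = np.zeros(n, dtype=np.int64)
    for src, _ in stream_edge_chunks(path):
        np.add.at(deg, src, 1)                     # out-degree pass over all edges
    ranks = np.full(n, 1.0 / n)
    for _ in range(iters):
        acc = np.zeros(n)
        for src, dst in stream_edge_chunks(path):  # scatter rank mass along edges
            np.add.at(acc, dst, ranks[src] / deg[src])
        ranks = (1 - alpha) / n + alpha * acc      # dangling mass ignored for brevity
    return ranks
```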

A more complicated case is when neither the vertices nor the edges fit into memory; then both have to be loaded into memory multiple times. GridGraph gives a good solution for this case: https://www.usenix.org/system/files/conference/atc15/atc15-paper-zhu.pdf
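A highly simplified sketch of the 2-D grid idea (assumptions throughout: edges have been pre-sorted into P x P block files named "block_{i}_{j}.bin" holding uint32 (src, dst) pairs, and the vertex vectors live in memory-mapped files; this illustrates the access pattern, not GridGraph's actual on-disk format):

```python
import numpy as np

ALPHA, P, N = 0.85, 16, 100_000_000   # damping, grid size, vertex count (example values)
CHUNK = (N + P - 1) // P              # vertices per chunk

# Vertex state lives on disk; the OS pages in only the windows we touch.
ranks = np.memmap("ranks.f64", dtype=np.float64, mode="r",  shape=(N,))
deg   = np.memmap("deg.i64",   dtype=np.int64,   mode="r",  shape=(N,))
acc   = np.memmap("acc.f64",   dtype=np.float64, mode="r+", shape=(N,))

# One PageRank iteration over the grid of edge blocks.
for j in range(P):                        # destination chunk: written exactly once
    lo, hi = j * CHUNK, min((j + 1) * CHUNK, N)
    acc_j = np.zeros(hi - lo)             # in-memory accumulator for chunk j
    for i in range(P):                    # source chunks: streamed past chunk j
        # block_{i}_{j}.bin holds edges with src in chunk i, dst in chunk j.
        edges = np.fromfile(f"block_{i}_{j}.bin", dtype=np.uint32).reshape(-1, 2)
        src = edges[:, 0]
        dst = edges[:, 1].astype(np.int64)
        np.add.at(acc_j, dst - lo, ranks[src] / deg[src])
    acc[lo:hi] = (1 - ALPHA) / N + ALPHA * acc_j   # flush the finished chunk
# Afterwards, swap acc and ranks (e.g. rename the files) for the next iteration.
```

Because each destination chunk is finished before moving on, every vertex vector is written once per iteration, while each source chunk is re-read at most P times; that trade-off is the core of the grid layout.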
