I am writing a small sequential program to calculate PageRank for a modest (though not completely trivial) dataset.

The algorithm goes like this:

while ( not converged ) {
   // Do a bunch of things to calculate PR
}

I am clear on the algorithm apart from the convergence criterion. What is the best way to check whether the algorithm has converged? Should I:

Keep a copy of every node's PR from one iteration and check that each node's PR in the next iteration has the same value?

This seems highly inefficient to me. Is this the right way to do it?

Nitish Upreti
  • Why does it feel inefficient? It is really just one more `float` per vertex, which is nothing compared to the structure of the graph. For the computation, you just calculate the difference between two `float`s, which is also nothing compared to the rest of the math that you need to do ;) – Thomas Jungblut Jan 29 '15 at 21:24
  • Iterating over each node in two separate HashMaps on every iteration (for the PR comparison) seemed like something I could optimize. – Nitish Upreti Jan 29 '15 at 21:35
  • You can compute the difference while you are computing the page rank (which already requires one pass over the nodes). – Thomas Jungblut Jan 29 '15 at 21:45
  • Yes! You are right. I figured it out. Thanks! :) – Nitish Upreti Jan 29 '15 at 22:05

1 Answer

For each node, take the difference in score between the current iteration and the previous one. If this error falls below a chosen threshold, the graph has converged.

The TextRank paper describes this quite well:

Starting from arbitrary values assigned to each node in the graph, the computation iterates until convergence below a given threshold is achieved.

Convergence is achieved when the error rate for any vertex in the graph falls below a given threshold. The error rate of a vertex is defined as the difference between the “real” score of the vertex S(Vi) and the score computed at iteration k, S^(k)(Vi). Since the real score is not known apriori, this error rate is approximated with the difference between the scores computed at two successive iterations: S^(k+1)(Vi) - S^(k)(Vi).
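
As a rough illustration of that check, here is a minimal Python sketch. It is not taken from the question or from the TextRank paper: the function name `pagerank`, the parameters `damping`, `tol` and `max_iter`, the dict-of-lists graph format, and the way dangling nodes are handled are all assumptions made for the example. It also assumes the stricter reading of the criterion, namely that the largest per-node change must fall below the threshold. The change is tracked in the same pass that computes the new scores, so no second loop over the nodes is needed.

```python
def pagerank(graph, damping=0.85, tol=1e-6, max_iter=100):
    """Hypothetical helper: graph maps each node to the list of nodes it links to.

    Assumes every link target also appears as a key in `graph`.
    Returns (ranks, iterations_used, converged).
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # arbitrary (here uniform) starting values

    # Precompute incoming links so each node can sum over its predecessors.
    incoming = {v: [] for v in nodes}
    for u, outs in graph.items():
        for v in outs:
            incoming[v].append(u)

    for iteration in range(1, max_iter + 1):
        # Assumption: rank held by nodes with no out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph[u])
        new_rank = {}
        max_delta = 0.0  # largest |S^(k+1)(Vi) - S^(k)(Vi)| seen in this pass
        for v in nodes:
            received = sum(rank[u] / len(graph[u]) for u in incoming[v])
            new_rank[v] = (1 - damping) / n + damping * (received + dangling / n)
            # Track the change while computing the score: no extra pass needed.
            max_delta = max(max_delta, abs(new_rank[v] - rank[v]))
        rank = new_rank
        if max_delta < tol:
            return rank, iteration, True

    # Threshold never reached within max_iter iterations.
    return rank, max_iter, False


if __name__ == "__main__":
    example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    ranks, iterations, converged = pagerank(example)
    print(ranks, iterations, converged)
```

Only the previous and the current score dicts are kept, which matches the "one extra `float` per vertex" point made in the comments. Other stopping rules, such as the sum of the absolute changes, are also used in practice; the maximum is just one reasonable choice.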

jksnw