How to understand maxIterations in the Pregel implementation of Apache GraphX

Question

The official explanation is that maxIterations is meant for non-convergent algorithms. My question is: if I don't know whether my algorithm converges, how should I set the value of maxIterations? And if the algorithm does converge, what does this value mean?

BTW, I'm also confused about what an 'iteration' means in Pregel here. Which part of the code's execution counts as one iteration?

Here is part of the Pregel source code:

// Loop
var prevG: Graph[VD, ED] = null
var i = 0
while (activeMessages > 0 && i < maxIterations) {
  // Receive the messages and update the vertices.
  prevG = g
  g = g.joinVertices(messages)(vprog)
  graphCheckpointer.update(g)

  val oldMessages = messages
  // Send new messages, skipping edges where neither side received a message. We must cache
  // messages so it can be materialized on the next line, allowing us to uncache the previous
  // iteration.
  messages = GraphXUtils.mapReduceTriplets(
    g, sendMsg, mergeMsg, Some((oldMessages, activeDirection)))
  // The call to count() materializes `messages` and the vertices of `g`. This hides oldMessages
  // (depended on by the vertices of g) and the vertices of prevG (depended on by oldMessages
  // and the vertices of g).
  messageCheckpointer.update(messages.asInstanceOf[RDD[(VertexId, A)]])
  activeMessages = messages.count()

  logInfo("Pregel finished iteration " + i)

  // Unpersist the RDDs hidden by newly-materialized RDDs
  oldMessages.unpersist(blocking = false)
  prevG.unpersistVertices(blocking = false)
  prevG.edges.unpersist(blocking = false)
  // count the iteration
  i += 1
}

Thank you for your generous answers :)

score 0 · Answer 1 · answered Apr 10 '20 at 16:39

maxIterations is used to make sure the algorithm terminates. Note that Pregel is just a paradigm, so whether it converges depends on your algorithm (your message-sending function and vertex program, sendMsg and vprog in the source above). That is why we use Int.MaxValue as the maximum number of iterations when we are sure our algorithm will converge.

If you are not sure whether your algorithm terminates, it is better to set the value based on empirical tests. For example, if your algorithm is a heuristic that optimizes some value, more iterations generally bring you closer to your target. In that case, you decide when to stop based on how much time and how many resources you are willing to spend to get an answer (e.g. 100 iterations).

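Finally, the code uses the variable i to count iterations. Each pass through the while loop is one iteration: the vertices that received messages run vprog, new messages are sent and merged, and i is incremented at the end of the pass. Pregel stops when i reaches maxIterations, or earlier if the last pass produced no messages (that is, when the algorithm has converged).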
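To make the two stopping conditions concrete, here is a minimal sketch of the same control flow on plain Scala collections. It is not GraphX code: the object name, the toy edge list, and the helper functions are all hypothetical, and the "vertex program" is a simple shortest-path update chosen only because it is known to converge.

```scala
object PregelLoopSketch {
  type VertexId = Long

  // Hypothetical toy graph: directed weighted edges (src, dst, weight).
  val edges: Seq[(VertexId, VertexId, Double)] =
    Seq((1L, 2L, 1.0), (2L, 3L, 1.0), (3L, 4L, 1.0), (1L, 3L, 5.0))

  // Plays the role of sendMsg + mergeMsg: an edge sends a message only if it
  // would shorten the destination's distance; messages to the same vertex are
  // merged by taking the minimum.
  def sendMessages(dist: Map[VertexId, Double]): Map[VertexId, Double] =
    edges
      .collect { case (s, d, w) if dist(s) + w < dist(d) => d -> (dist(s) + w) }
      .groupBy(_._1)
      .map { case (d, msgs) => d -> msgs.map(_._2).min }

  // Returns the final distances and the number of iterations actually executed.
  def shortestPaths(source: VertexId, maxIterations: Int): (Map[VertexId, Double], Int) = {
    val vertexIds = edges.flatMap { case (s, d, _) => Seq(s, d) }.distinct
    var dist: Map[VertexId, Double] =
      vertexIds.map(id => id -> (if (id == source) 0.0 else Double.PositiveInfinity)).toMap

    var messages = sendMessages(dist) // first round, computed before the loop
    var i = 0
    // Same two exit conditions as the loop in the question.
    while (messages.nonEmpty && i < maxIterations) {
      // Plays the role of vprog: only vertices that received a message are updated.
      dist = dist ++ messages.map { case (id, m) => id -> math.min(dist(id), m) }
      messages = sendMessages(dist)
      i += 1 // one pass of the loop = one iteration
    }
    (dist, i)
  }

  def main(args: Array[String]): Unit = {
    val (full, n) = shortestPaths(1L, Int.MaxValue)
    println(s"no cap: $n iterations, distance to vertex 4 = ${full(4L)}")
    val (capped, m) = shortestPaths(1L, 1)
    println(s"cap 1:  $m iteration, distance to vertex 4 = ${capped(4L)}")
  }
}
```

On this toy graph the uncapped run stops by itself after 3 iterations because no more messages are produced, while the run capped at 1 iteration stops early and still has vertex 4 at infinity. That is the trade-off described above: with a convergent algorithm the cap is only a safety net, and with a cap that is too low you get a partial result.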
