I am new to GraphX and I do not understand the Vertex Program and Merge Message parts of the Pregel API. Don't they do the same thing? For example, what is the difference between the Vertex Program and the Merge Message in the following Pregel code, taken from the Spark website?

import org.apache.spark.graphx._
// Import random graph generation library
import org.apache.spark.graphx.util.GraphGenerators
// A graph with edge attributes containing distances
val graph: Graph[Long, Double] =
  GraphGenerators.logNormalGraph(sc, numVertices = 100).mapEdges(e => e.attr.toDouble)
val sourceId: VertexId = 42 // The ultimate source
// Initialize the graph such that all vertices except the root have distance infinity.
val initialGraph = graph.mapVertices((id, _) => if (id == sourceId) 0.0 else Double.PositiveInfinity)
val sssp = initialGraph.pregel(Double.PositiveInfinity)(
  (id, dist, newDist) => math.min(dist, newDist), // Vertex Program
  triplet => {  // Send Message
    if (triplet.srcAttr + triplet.attr < triplet.dstAttr) {
      Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
    } else {
      Iterator.empty
    }
  },
  (a, b) => math.min(a, b) // Merge Message
  )
println(sssp.vertices.collect.mkString("\n"))
Morteza Mashayekhi

1 Answer

The key difference is that mergeMsg has no access to the context of any vertex. It takes two individual messages and combines them into one. Once all messages bound for a vertex have been combined, the result is passed to vprog as a single message.

So vprog never sees the individual messages, only the combined result (in this example, the minimum of the candidate distances). mergeMsg is applied repeatedly, two messages at a time, until only one message is left, and that is the message vprog receives along with the vertex's ID and current value.
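To make the division of labour concrete, here is a minimal plain-Scala sketch of what happens at a single vertex during one superstep. It does not use Spark at all: `PregelStepSketch` and `superstep` are hypothetical names made up for illustration, and only `vprog` and `mergeMsg` are copied from the question. It is a simplified model of the behaviour described above, not GraphX's actual implementation.

```scala
// Plain-Scala model of one Pregel superstep at a single vertex.
// No Spark required; `superstep` is a hypothetical helper, not a GraphX API.
object PregelStepSketch {
  // mergeMsg: sees only two messages at a time, never the vertex itself.
  val mergeMsg: (Double, Double) => Double = (a, b) => math.min(a, b)

  // vprog: sees the vertex id, its current value, and ONE combined message.
  val vprog: (Long, Double, Double) => Double =
    (id, dist, newDist) => math.min(dist, newDist)

  // Combine the vertex's inbox pairwise, then update the vertex once.
  def superstep(id: Long, current: Double, inbox: Seq[Double]): Double =
    inbox.reduceOption(mergeMsg) match {
      case Some(combined) => vprog(id, current, combined) // single vprog call
      case None           => current // no messages: the vertex is skipped
    }

  def main(args: Array[String]): Unit = {
    println(superstep(7L, 10.0, Seq(8.0, 3.5, 6.0))) // 3.5
    println(superstep(7L, 2.0, Seq(8.0, 3.5, 6.0)))  // 2.0
    println(superstep(7L, 10.0, Seq.empty))          // 10.0
  }
}
```

In the shortest-path example both functions happen to be `math.min`, which is why they look interchangeable. Their types differ in general: mergeMsg maps two messages to one message, while vprog maps a vertex ID, a vertex value, and a message to a new vertex value, so the message type and the vertex value type do not have to be the same.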

David Griffin