How to use combiner in aggregateMessages in GraphX

Question

The GraphX aggregateMessages API is declared as follows:

class Graph[VD, ED] {
  def aggregateMessages[Msg: ClassTag](
      sendMsg: EdgeContext[VD, ED, Msg] => Unit,
      mergeMsg: (Msg, Msg) => Msg,
      tripletFields: TripletFields = TripletFields.All)
    : VertexRDD[Msg]
}

However, I want to change the return type of the merge stage; that is, I want something like combineByKey instead of reduceByKey. How can I do this while keeping the advantages of GraphX? In other words, how can I use just the result of sendMsg and skip the mergeMsg stage of this function?

The advantage of GraphX I mean is that it is "vertex-centric". If I use map and combineByKey instead, they perform a global shuffle, which costs a lot of time and goes against the vertex-centric idea.

I have found a possible solution but it is not very general. By the way, if you have a better choice of graph compute engine integrating with Neo4j, please let me know, thanks! — Litchy, May 18 '18 at 02:53

score 0 · Answer 1 · answered May 17 '18 at 03:42

The collectEdges method in the GraphOps API might help.

It collects the neighboring edges of each vertex and returns a VertexRDD[Array[Edge[ED]]], so it changes the return type and collects the messages at the same time; see the API documentation.
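As a rough illustration of the two options, here is a minimal sketch. It assumes a graph of type Graph[String, Double] whose edge attribute is a weight; the object and method names (CombinerSketch, incomingEdges, averageIncomingWeight) are made up for this example and are not part of GraphX. The first method simply calls collectEdges. The second imitates combineByKey inside aggregateMessages: sendMsg plays the role of createCombiner by emitting an accumulator, mergeMsg merges accumulators, and mapValues converts the accumulator into the final type.

```scala
import org.apache.spark.graphx.{Edge, EdgeDirection, Graph, TripletFields, VertexRDD}

// Hypothetical helper object, for illustration only.
object CombinerSketch {

  // Option 1: gather every incoming edge of each vertex into an array.
  def incomingEdges(graph: Graph[String, Double]): VertexRDD[Array[Edge[Double]]] =
    graph.collectEdges(EdgeDirection.In)

  // Option 2: combineByKey-style aggregation inside aggregateMessages.
  // The message type is a (sum, count) accumulator rather than the final value.
  def averageIncomingWeight(graph: Graph[String, Double]): VertexRDD[Double] = {
    val sumAndCount: VertexRDD[(Double, Int)] =
      graph.aggregateMessages[(Double, Int)](
        // "createCombiner": wrap each edge weight in an accumulator.
        ctx => ctx.sendToDst((ctx.attr, 1)),
        // "mergeCombiners": add two accumulators component-wise.
        (a, b) => (a._1 + b._1, a._2 + b._2),
        // Only the edge attribute is read, so vertex attributes need not be shipped.
        TripletFields.EdgeOnly)

    // Final conversion from the accumulator type to the desired result type.
    sumAndCount.mapValues((acc: (Double, Int)) => acc._1 / acc._2)
  }
}
```

mergeMsg itself still has to return the message type, so the change of type happens in the mapValues step afterwards. Vertices that receive no message do not appear in the VertexRDD returned by aggregateMessages.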

answered May 17 '18 at 03:42

Litchy
