The GraphX aggregateMessages API is declared as follows:

class Graph[VD, ED] {
  def aggregateMessages[Msg: ClassTag](
      sendMsg: EdgeContext[VD, ED, Msg] => Unit,
      mergeMsg: (Msg, Msg) => Msg,
      tripletFields: TripletFields = TripletFields.All)
    : VertexRDD[Msg]
}

However, I want to change the return type of the merge stage; that is, I want something like combineByKey rather than reduceByKey. How can I do this while keeping the advantages of GraphX? In other words, how can I use the result of sendMsg directly and skip the mergeMsg stage of this function?

By "the advantage of GraphX" I mean that it is vertex-centric. If I use map and combineByKey instead, it triggers a global shuffle, which costs a lot of time and goes against the vertex-centric idea.

Litchy
  • I have found a possible solution but it is not very general. By the way, if you have a better choice of graph compute engine integrating with Neo4j, please let me know, thanks! – Litchy May 18 '18 at 02:53
  • I have put it in the answer below – Litchy May 18 '18 at 02:53

1 Answer

The collectEdges method in the GraphOps API might help.

It collects the edges incident to each vertex and returns a VertexRDD[Array[Edge[ED]]], so it changes the return type and collects the messages in a single step. See the API documentation.
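A minimal sketch of how this could look, assuming a local Spark session and a hypothetical toy graph (the object name `CollectEdgesSketch`, the helpers `toyGraph`, `inEdges` and `inWeights`, and the vertex and edge values are all made up for illustration). The second helper is not part of the answer above: it shows a related trick where the message type is itself a collection, so that mergeMsg only concatenates and the per-vertex result behaves roughly like the output of combineByKey.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeDirection, Graph, TripletFields, VertexRDD}

object CollectEdgesSketch {

  // Hypothetical toy graph: vertex attribute is a name, edge attribute is a weight.
  def toyGraph(sc: SparkContext): Graph[String, Int] = {
    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 10), Edge(3L, 2L, 20), Edge(1L, 3L, 30)))
    Graph(vertices, edges)
  }

  // collectEdges gathers the incoming edges of every vertex into an array.
  // Vertices without incoming edges do not appear in the result.
  def inEdges(graph: Graph[String, Int]): VertexRDD[Array[Edge[Int]]] =
    graph.collectEdges(EdgeDirection.In)

  // Assumption (not from the answer): use a collection as the message type.
  // sendMsg wraps each value in a Vector and mergeMsg only concatenates,
  // so nothing is reduced to a single value of the element type.
  def inWeights(graph: Graph[String, Int]): VertexRDD[Vector[Int]] =
    graph.aggregateMessages[Vector[Int]](
      ctx => ctx.sendToDst(Vector(ctx.attr)), // one-element message per edge
      _ ++ _,                                 // merge by concatenation
      TripletFields.EdgeOnly)                 // only the edge attribute is read

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("collect-edges-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val graph = toyGraph(sc)
      inEdges(graph).collect().foreach { case (id, es) =>
        println(s"vertex $id has incoming edges: ${es.mkString(", ")}")
      }
      inWeights(graph).collect().foreach { case (id, ws) =>
        println(s"vertex $id has incoming weights: ${ws.mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Both variants stay inside aggregateMessages-style processing rather than going through a separate map plus combineByKey on a plain RDD. The order of elements inside each array or vector is not guaranteed, so sort them if the order matters.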

Litchy