I have been looking through the GraphX on Spark documentation, trying to work out how to calculate all the two-step (and potentially longer) connections in a graph.

Suppose I have the following structure:

  • A -> b
  • b -> C
  • b -> D

Then A is connected to C and D via b: (A -> b -> C) and (A -> b -> D).

I had a look at the connected components functions, but I am not sure how to extend them to this case. In reality b will be a different vertex type from A, C and D, and I am not sure whether that has any effect.

Any suggestions would be greatly appreciated; I am pretty new to GraphX.

SChorlton

1 Answer

It seems you just need to use the collectNeighborIds action and then join the result with a reversed copy of itself. I wrote some code:

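import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

val graph: Graph[Int, Int] = ...
// (vertex, ids of its out-neighbors)
val bros = graph.collectNeighborIds(EdgeDirection.Out)
// Reversed copy: (neighbor, vertex)
val flat = bros.flatMap(x => x._2.map(y => (y, x._1)))
// Join on the middle vertex, then merge the second neighbors per source vertex
val brosofbros: RDD[(VertexId, Array[VertexId])] = flat.join(bros)
  .map(x => (x._2._1, x._2._2))
  .reduceByKey(_ ++ _)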

In the end, 'brosofbros' contains each vertex id together with all of its second neighbors; in your example it would be [A, Array[C, D]] (the intermediate vertex b itself is not included).
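To check the join logic without a Spark cluster, here is a minimal sketch of the same idea using plain Scala collections. It is not GraphX code: the object name TwoStepNeighbors, the helper names, and the ids 1 to 4 standing in for A, b, C and D are all assumptions made for illustration. Unlike the RDD version, it also removes duplicate second neighbors.

```scala
// Plain-Scala sketch of the join logic; ids 1L..4L stand in for A, b, C, D (assumption).
object TwoStepNeighbors {
  type VertexId = Long

  // Directed edges of the example: A -> b, b -> C, b -> D
  val edges: Seq[(VertexId, VertexId)] = Seq((1L, 2L), (2L, 3L), (2L, 4L))

  // Rough stand-in for collectNeighborIds(EdgeDirection.Out): vertex -> out-neighbors.
  // Vertices without outgoing edges are simply absent from this map.
  def outNeighbors(es: Seq[(VertexId, VertexId)]): Map[VertexId, Seq[VertexId]] =
    es.groupBy(_._1).map { case (src, out) => src -> out.map(_._2) }

  // For every edge src -> mid, attach mid's out-neighbors to src,
  // then group the resulting (src, dst) pairs by src.
  def secondNeighbors(es: Seq[(VertexId, VertexId)]): Map[VertexId, Seq[VertexId]] = {
    val nbrs = outNeighbors(es)
    es.flatMap { case (src, mid) =>
        nbrs.getOrElse(mid, Seq.empty).map(dst => src -> dst)
      }
      .groupBy(_._1)
      .map { case (src, pairs) => src -> pairs.map(_._2).distinct }
  }
}
```

Calling TwoStepNeighbors.secondNeighbors(TwoStepNeighbors.edges) maps vertex 1 (A) to vertices 3 and 4 (C and D). For connections of three or more steps, one possible approach is to repeat the same join against the previous result, although that may get expensive on large graphs.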

Hlib