I am using GraphX for the first time and I want to build a Graph incrementally. To start, I need to connect the first two nodes with an edge, given that I have two RDDs (each one holds a single value):

firstRDD: RDD[((Int, Array[Int]), ((VertexId, Array[Int]), Int))]
secondRDD: RDD[((Int, Array[Int]), ((VertexId, Array[Int]), Int))]  

I want to connect the VertexId from the first RDD with the VertexId from the second one. I appreciate your help.

fadhloun anis
  • Are you saying you have two RDDs with an equal number of rows, each row contains a VertexId, and you want to pair up one VertexId from the first RDD with one from the second RDD? – David Griffin Jun 13 '15 at 21:23
  • Exactly, that's what I want to do. I want the Graph to be built when I pair up the two VertexIds. – fadhloun anis Jun 13 '15 at 21:38

1 Answer

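Basically, you use map with a pattern-matching case to pick the VertexId out of each row, then use RDD.zip to pair the two results up, then use another map to turn each pair into an Edge. The result is an RDD[Edge[Int]]. Note that RDD.zip requires both RDDs to have the same number of partitions and the same number of elements in each partition: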

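import org.apache.spark.graphx.Edge

// Keep only the VertexId from each row; "_" ignores the fields we don't need.
val edges = firstRDD.map {
  case (_, ((vertex1, _), _)) => vertex1
}.zip(
  secondRDD.map {
    case (_, ((vertex2, _), _)) => vertex2
  }
).map { case (vertex1, vertex2) => Edge(vertex1, vertex2, 0) } // 0 is a placeholder edge attribute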
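
Since the goal in the question is a Graph rather than just the edges, here is a minimal sketch of how the zipped edges could be passed to Graph.fromEdges. The object name ConnectTwoVertices, the helper buildGraph, the Row type alias, the sample data, the local[1] master, and the default vertex attribute 0 are all assumptions made for illustration; they do not come from the question. Both sample RDDs are created with a single partition so that the RDD.zip requirement holds.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object ConnectTwoVertices {
  // Alias for the row shape given in the question (hypothetical name).
  type Row = ((Int, Array[Int]), ((VertexId, Array[Int]), Int))

  // Hypothetical helper: pairs up the VertexIds of the two RDDs and builds a Graph.
  def buildGraph(firstRDD: RDD[Row], secondRDD: RDD[Row]): Graph[Int, Int] = {
    val sources: RDD[VertexId] = firstRDD.map { case (_, ((id, _), _)) => id }
    val targets: RDD[VertexId] = secondRDD.map { case (_, ((id, _), _)) => id }

    // zip needs the same number of partitions and elements per partition on both sides.
    val edges: RDD[Edge[Int]] =
      sources.zip(targets).map { case (src, dst) => Edge(src, dst, 0) }

    // Vertices are derived from the edges; each gets the assumed default attribute 0.
    Graph.fromEdges(edges, defaultValue = 0)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("connect-two-vertices").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample rows, one per RDD, each in a single partition.
      val firstRDD = sc.parallelize(Seq[Row](((0, Array(1)), ((1L, Array(2)), 3))), 1)
      val secondRDD = sc.parallelize(Seq[Row](((0, Array(1)), ((2L, Array(2)), 3))), 1)

      val graph = buildGraph(firstRDD, secondRDD)
      graph.edges.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If the vertices need to carry the Array[Int] attributes from the original rows, Graph(vertices, edges) with an explicit vertex RDD would probably be a better fit than Graph.fromEdges, which only assigns a default attribute.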
David Griffin