
I can build a graph from a vertexRDD and an edgeRDD with the GraphX API without any problem, e.g.:

val graph: Graph[(String, Int), Int] = Graph(vertexRDD, edgeRDD)

However, I don't know where to start if I want to use two separate vertexRDDs instead of just one (a bipartite graph). For example, a graph containing shopper and product vertices.

My question is broad so I'm not expecting a detailed example, but rather a hint or nudge in the right direction. Any suggestions would be much appreciated.

zero323
Christopher Mills
  • If your vertices contain the same types, why not `union()` both vertex RDD's and submit that to your graph? – Rohan Aletty Oct 20 '15 at 16:13
  • I am not sure if that's what you're looking for but you can `union()` two RDDs having vertices (just note that you need unique `VertexId`s) and then create edges joining a shopper vertex and a product vertex. If you want, you can also join two graphs (or graph and an RDD of vertices) via their `VertexId`. It's hard to tell what would be the best for you unless you provide more details. – lpiepiora Oct 20 '15 at 18:40
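The `union()` suggestion in the comments above could look roughly like the sketch below. It assumes both vertex sets share the attribute type `(String, Int)` from the question, and that the raw shopper and product ids overlap, so the product ids are shifted by a made-up `ProductIdOffset` to keep every `VertexId` unique. The names, the sample data, and the meaning of the `Int` fields are hypothetical.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object UnionVerticesSketch {
  // Hypothetical offset that moves product ids out of the shopper id range
  val ProductIdOffset = 1000000L

  def build(sc: SparkContext): Graph[(String, Int), Int] = {
    // Made-up sample data; the Int marks the side (0 = shopper, 1 = product)
    val shoppers: RDD[(VertexId, (String, Int))] =
      sc.parallelize(Seq((1L, ("user1", 0)), (2L, ("user2", 0))))

    // Raw product ids also start at 1, so shift them before the union
    val products: RDD[(VertexId, (String, Int))] =
      sc.parallelize(Seq((1L, ("foo", 1)), (2L, ("bar", 1))))
        .map { case (id, attr) => (id + ProductIdOffset, attr) }

    // Every edge joins a shopper to a product; the Int is a made-up quantity
    val edges: RDD[Edge[Int]] = sc.parallelize(Seq(
      Edge(1L, 1L + ProductIdOffset, 3),
      Edge(2L, 2L + ProductIdOffset, 1)))

    Graph(shoppers.union(products), edges)
  }
}
```

Without the offset, shopper 1 and product 1 would end up sharing a single vertex id, which is why the ids need to be unique across both sets.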

1 Answer


For example, to model users and products as a bipartite graph, we might do the following:

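import org.apache.spark.graphx.{Graph, VertexId, VertexRDD}
import org.apache.spark.rdd.RDD

// Common supertype so that both kinds of vertex fit into a single graph
trait VertexProperty
case class UserProperty(name: String) extends VertexProperty
case class ProductProperty(name: String, price: Double) extends VertexProperty

// `sc` is the SparkContext (predefined in spark-shell).
// Both RDDs are typed with the shared VertexProperty so they can be combined.
val users: RDD[(VertexId, VertexProperty)] = sc.parallelize(Seq(
  (1L, UserProperty("user1")), (2L, UserProperty("user2"))))

val products: RDD[(VertexId, VertexProperty)] = sc.parallelize(Seq(
  (1001L, ProductProperty("foo", 1.00)), (1002L, ProductProperty("bar", 3.99))))

// Vertex ids must be unique across users and products
val vertices = VertexRDD(users ++ products)

// The graph might then have the type (null is only a placeholder here;
// build it with Graph(vertices, edges) once you have an edge RDD):
val graph: Graph[VertexProperty, String] = null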
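To go from the type above to an actual graph, something along these lines might work. This is only a sketch: it assumes the input starts out as one RDD of shopper names and one RDD of `(name, price)` pairs, as described in the comments below, and it uses made-up edges with a `"bought"` label. The object name, the sample data, and the edge label are all hypothetical.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

trait VertexProperty
case class UserProperty(name: String) extends VertexProperty
case class ProductProperty(name: String, price: Double) extends VertexProperty

object BipartiteGraphSketch {
  def build(sc: SparkContext): Graph[VertexProperty, String] = {
    // Hypothetical typed inputs, one RDD per side of the bipartite graph
    val shoppers: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "user1"), (2L, "user2")))
    val products: RDD[(VertexId, (String, Double))] =
      sc.parallelize(Seq((1001L, ("foo", 1.00)), (1002L, ("bar", 3.99))))

    // Wrap each attribute in its case class and widen it to VertexProperty.
    // RDDs are invariant, so the ascription is what lets the two be combined.
    val userVertices: RDD[(VertexId, VertexProperty)] =
      shoppers.map { case (id, name) => (id, UserProperty(name): VertexProperty) }
    val productVertices: RDD[(VertexId, VertexProperty)] =
      products.map { case (id, (name, price)) =>
        (id, ProductProperty(name, price): VertexProperty)
      }

    // Made-up edges, always directed from a shopper to a product
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 1001L, "bought"),
      Edge(1L, 1002L, "bought"),
      Edge(2L, 1001L, "bought")))

    Graph(userVertices ++ productVertices, edges)
  }
}
```

In spark-shell, `BipartiteGraphSketch.build(sc)` should return a graph with four vertices and three edges.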
zero323
eliasah
  • Yes it works. I hope you don't mind small changes but this `class VertexProperty()` looked rather strange. I know, official docs, but... – zero323 Oct 20 '15 at 20:24
  • No problem. Please do! – eliasah Oct 20 '15 at 20:26
  • Thanks for the feedback. I have now created two separate VertexRDDs, `VRDDShopper: VertexRDD[String]` and `VRDDProduct: VertexRDD[(String, Double)]`. I am assuming that I now need to assign them to their respective case classes, `UserProperty` and `ProductProperty`. At this point I hit a conceptual wall. I need to somehow get my VertexRDDs into `Graph[VD, ED]` via the class structure that you guys have suggested. How do I do this when the case classes are not defined as VertexRDDs? Again, my apologies for not being specific, but this is something that I am struggling with conceptually. – Christopher Mills Oct 22 '15 at 08:12
  • I'm not sure I understand your question. – eliasah Oct 22 '15 at 08:15
  • Maybe this is a clearer way to put the question: How would I assign my two VertexRDDs to the case classes defined above? – Christopher Mills Oct 22 '15 at 09:38
  • You can perform a `++` to create a `VertexProperty`. Correct me, @zero323, if I'm mistaken. – eliasah Oct 22 '15 at 12:32
  • More or less. I've edited the answer with an example. Please note that products and users are of type `RDD[(VertexId, VertexProperty)]`, not `RDD[(VertexId, ProductProperty)]` / `RDD[(VertexId, UserProperty)]` – zero323 Oct 22 '15 at 18:13
  • Thanks Guys! After 4 days of trying I have a bipartite graph. This is awesome stuff. I have one last question if I may; assuming that I have also inserted the edges connecting the vertices (which I have) what would I write to display the products linked to, say using the example above, user1? Otherwise, thank you both for your time and effort. This has really helped me a lot. – Christopher Mills Oct 22 '15 at 19:24
  • @ChristopherMills try doing some research to find out how to do that, and we'll help you if you run into an issue with it. – eliasah Oct 22 '15 at 19:28
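For the last question in the comments (listing the products linked to user1), one possible approach is to go through `graph.triplets`, which exposes the source and destination attributes of every edge. The sketch below assumes that edges are directed from a user to a product and that users are identified by name. `ProductsForUserSketch`, `sampleGraph`, and `productsFor` are hypothetical names, and the sample edges are made up.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

trait VertexProperty
case class UserProperty(name: String) extends VertexProperty
case class ProductProperty(name: String, price: Double) extends VertexProperty

object ProductsForUserSketch {
  // Small sample graph using the vertices from the answer and made-up edges
  def sampleGraph(sc: SparkContext): Graph[VertexProperty, String] = {
    val vertices: RDD[(VertexId, VertexProperty)] = sc.parallelize(Seq(
      (1L, UserProperty("user1")), (2L, UserProperty("user2")),
      (1001L, ProductProperty("foo", 1.00)), (1002L, ProductProperty("bar", 3.99))))
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 1001L, "bought"),
      Edge(1L, 1002L, "bought"),
      Edge(2L, 1001L, "bought")))
    Graph(vertices, edges)
  }

  // Products at the destination end of every edge that starts at the given user.
  // Assumes edges always point from a user to a product.
  def productsFor(graph: Graph[VertexProperty, String],
                  userName: String): Array[ProductProperty] =
    graph.triplets
      .map(t => (t.srcAttr, t.dstAttr)) // copy the attributes out of the triplet
      .collect { case (UserProperty(n), p: ProductProperty) if n == userName => p }
      .collect() // bring the (small) result back to the driver
}
```

In spark-shell this could be tried with `ProductsForUserSketch.productsFor(ProductsForUserSketch.sampleGraph(sc), "user1").foreach(println)`. If the edges in your graph point the other way (product to user), swap `srcAttr` and `dstAttr`.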