I am working with GraphX and trying to add features to edges. I have a CSV file with the columns Id1, Id2, Weight, Type.

I am able to load the Ids and one feature, either the weight or the type. Is there a way to store multiple features on an edge? Here is a snippet of my code, which works with a single feature:

val edgesWriterWriterCollaborated: RDD[Edge[String]] = sc.textFile(edgeWeightedWriterWriterCollaborated).map {
  line =>
    val row = line.split(",")
    Edge(row(0).toLong, row(1).toLong, row(2))
}

When I try to store both features as a tuple, I get an error:

val edgesWriterWriterCollaborated: RDD[Edge[Tuple2]] = sc.textFile(edgeWeightedWriterWriterCollaborated).map {
  line =>
    val row = line.split(",")
    Edge(row(0).toLong, row(1).toLong, (row(2), row(3)))
}

Update:

I fixed my code like so:

case class WriterWriterProperties(weight: String, edgeType: String)
val edgesWriterWriterCollaborated: RDD[Edge[WriterWriterProperties]] = sc.textFile(edgeWeightedWriterWriterCollaborated).map {
  line =>
    val row = line.split(",")
    Edge(row(0).toLong, row(1).toLong, WriterWriterProperties(row(2), row(3)))
}

However, when I try to print the triplets:

graph4.triplets.foreach(println)

I get an error: Caused by: java.io.NotSerializableException

gannina
  • Possible duplicate of [Spark GraphX: add multiple edge weights](https://stackoverflow.com/questions/46680128/spark-graphx-add-multiple-edge-weights) – Shaido Jun 07 '18 at 01:43

1 Answer

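Sure. Use a Tuple2, and declare the edge type with its type arguments, e.g. RDD[Edge[(String, String)]], because a bare Tuple2 is not a complete type: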

Edge(row(0).toLong, row(1).toLong, (row(2), row(3)))

or any domain-specific object that makes sense in your case:

case class FooBar(foo: String, bar: String)

Edge(row(0).toLong, row(1).toLong, FooBar(row(2), row(3)))
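
Putting the pieces together, here is a minimal sketch of the case class approach, not a definitive implementation. Everything named here is an assumption for illustration: EdgeProps, MultiFeatureEdges and parseEdge are hypothetical names, the sample rows stand in for your CSV file, and I am assuming the weight is numeric and the file has no header row. The case class sits at the top level because a class nested inside a non-serializable enclosing class is a common cause of java.io.NotSerializableException, though I cannot confirm that is what happened in your case. The triplets are mapped to strings and collected to the driver before printing, so both features show up in the driver output.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD

// Hypothetical edge attribute type. Defined at the top level (not inside
// another class or method) so Spark can serialize it on its own.
case class EdgeProps(weight: Double, edgeType: String)

object MultiFeatureEdges {
  // Parse one CSV line of the form "Id1,Id2,Weight,Type" into an edge
  // that carries both features. Assumes a numeric weight and no header row.
  def parseEdge(line: String): Edge[EdgeProps] = {
    val row = line.split(",").map(_.trim)
    Edge(row(0).toLong, row(1).toLong, EdgeProps(row(2).toDouble, row(3)))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("multi-feature-edges").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Inline sample rows stand in for sc.textFile(pathToCsv).
      val lines = sc.parallelize(Seq("1,2,0.5,coauthor", "2,3,1.5,editor"))
      val edges: RDD[Edge[EdgeProps]] = lines.map(parseEdge)

      // Vertices are created from the edge endpoints with a default attribute.
      val graph = Graph.fromEdges(edges, defaultValue = "writer")

      // Turn each triplet into a string on the executors, then bring the
      // strings to the driver before printing.
      graph.triplets
        .map(t => s"${t.srcId} -> ${t.dstId}: weight=${t.attr.weight}, type=${t.attr.edgeType}")
        .collect()
        .foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The same structure works with a tuple attribute: declare the RDD as RDD[Edge[(Double, String)]] and read the features with t.attr._1 and t.attr._2.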
  • However, when I print it with graph3.triplets.foreach(println), I only get the first item in the tuple. Am I missing something? – gannina Jun 06 '18 at 22:40