I have the following problem. I have a DataFrame "vert" in Spark with three columns: Origin (String), Destination (String), and Distance (Integer). It is simply data about flights between different cities. For example, it could look like this:
Chicago Houston 670
London Chicago 1200
...
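In case it helps to reproduce this, a DataFrame of that shape could be built roughly as follows. This is only a sketch: the local SparkSession settings and the app name are placeholders I picked for the example, not my real setup, and only the two rows shown above are included.

```scala
import org.apache.spark.sql.SparkSession

// Placeholder local session; master and appName are assumptions for this example only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("flights-example")
  .getOrCreate()
import spark.implicits._ // enables .toDF on a Seq of tuples

// The two sample rows from above, with the column names described in the question.
val vert = Seq(
  ("Chicago", "Houston", 670),
  ("London", "Chicago", 1200)
).toDF("Origin", "Destination", "Distance")
```
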
I want to create the corresponding graph in GraphX, with the distances as edge attributes. So first I have to define the edge RDD. I found the following way to do this:
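import org.apache.spark.graphx.Edge
import scala.util.MurmurHash
val ed = vert.rdd
  // key: (hash of Origin, hash of Destination); value: the constant 1
  .map(x => ((MurmurHash.stringHash(x(0).toString), MurmurHash.stringHash(x(1).toString)), 1))
  // sums the 1s, so the edge attribute below is a row count per pair
  .reduceByKey(_ + _)
  .map(x => Edge(x._1._1, x._1._2, x._2))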
Unfortunately, this command only takes the Origin and Destination columns into account and ignores the Distance column: the edge attribute ends up being the number of rows per Origin/Destination pair, so the RDD "ed" contains no information about the distances. How do I have to change the command so that the distances are also in the RDD?
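One direction I have considered, though I am not sure it is the right approach, is to read the third column and pass it to Edge directly instead of counting. The sketch below assumes the columns are in the order Origin, Destination, Distance and contain no nulls. The helper names toEdges and toGraph are made up for illustration, and I used MurmurHash3 from scala.util.hashing here rather than the MurmurHash object from my snippet, so the vertex ids will differ from the ones produced above.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame
import scala.util.hashing.MurmurHash3

// Hypothetical helper: one edge per row, with Distance kept as the edge attribute.
// Assumes column 0 and 1 are non-null Strings and column 2 is a non-null Integer.
def toEdges(df: DataFrame): RDD[Edge[Int]] =
  df.rdd.map { row =>
    val src = MurmurHash3.stringHash(row.getString(0)).toLong // vertex id from Origin
    val dst = MurmurHash3.stringHash(row.getString(1)).toLong // vertex id from Destination
    Edge(src, dst, row.getInt(2))                             // Distance as attribute
  }

// Hypothetical helper: vertices are derived from the edge endpoints;
// 0 is only a placeholder vertex attribute.
def toGraph(df: DataFrame): Graph[Int, Int] =
  Graph.fromEdges(toEdges(df), defaultValue = 0)
```

This would be called as toGraph(vert). Unlike my original command there is no reduceByKey step, so if the same Origin/Destination pair appears in several rows, each row would become its own parallel edge.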
Sorry if this is a stupid question, and thanks in advance.