How can I get the whole edge that has the maximum weight (attr) in a GraphX graph? For example:

| srcID | dstID | attr |
|-------|-------|------|
|   3   |   7   |   2  |
|   4   |   7   |   8  |
|   4   |   8   |  11  |
|   6   |   7   |   4  |

I want the edge (4, 8, 11) returned, since 11 is the maximum weight.

alex9311
Mayuri M.

1 Answer

Might not be the fastest but this works for me:

import org.apache.spark.rdd._
import org.apache.spark.graphx._

// Vertices as (id, label) pairs; sc is the SparkContext provided by spark-shell
val nodes: RDD[(VertexId, String)] = sc.parallelize(Array((3L, "3"), (7L, "7"), (4L, "4"), (8L, "8"), (6L, "6")))
// Edges as Edge(srcId, dstId, weight)
val edges: RDD[Edge[Int]] = sc.parallelize(Array(Edge(3L, 7L, 2), Edge(4L, 7L, 8), Edge(4L, 8L, 11), Edge(6L, 7L, 4)))
val graph: Graph[String, Int] = Graph(nodes, edges, "z")

// Key each edge by its weight, then take the maximum under the default tuple ordering
graph.edges.map(e => (e.attr, (e.srcId, e.dstId))).max
//res: (Int, (org.apache.spark.graphx.VertexId, org.apache.spark.graphx.VertexId)) = (11,(4,8))

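This transforms the graph edges into key-value pairs with the attribute (weight) as the key, so max picks the pair with the largest weight. If several edges share the maximum weight, the tuple comparison falls back to the (srcId, dstId) pair, so only one of them is returned; the code can be modified if you need all of them.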
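If you want the result as an actual Edge object rather than a tuple, or want every edge that ties for the maximum weight, something like the following might work. This is only a sketch, not part of the original answer: the object name MaxWeightEdges and its two helper methods are made up for illustration, and it creates its own local SparkContext so it can run outside spark-shell. It uses RDD.reduce for the single edge and a max-then-filter pass for the ties.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.Edge
import org.apache.spark.rdd.RDD

// Hypothetical helper object, named here only for illustration.
object MaxWeightEdges {
  // Returns one heaviest edge as a full Edge object.
  // When weights tie, which of the tied edges wins is not guaranteed.
  // Note: reduce throws on an empty RDD.
  def maxEdge(edges: RDD[Edge[Int]]): Edge[Int] =
    edges.reduce((a, b) => if (a.attr >= b.attr) a else b)

  // Returns every edge whose weight equals the maximum weight.
  // This makes two passes over the data: one for the max, one for the filter.
  def allMaxEdges(edges: RDD[Edge[Int]]): Array[Edge[Int]] = {
    val maxWeight = edges.map(_.attr).max()
    edges.filter(_.attr == maxWeight).collect()
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs standalone.
    val conf = new SparkConf().setAppName("max-weight-edge").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val edges: RDD[Edge[Int]] = sc.parallelize(Seq(
        Edge(3L, 7L, 2), Edge(4L, 7L, 8), Edge(4L, 8L, 11), Edge(6L, 7L, 4)))
      println(maxEdge(edges))              // the edge from 4 to 8 with weight 11
      allMaxEdges(edges).foreach(println)  // same single edge here, since there is no tie
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell you could skip main and call MaxWeightEdges.maxEdge(graph.edges) directly, since graph.edges is itself an RDD of Edge values.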

alex9311