To find the shortest path between vertices using Spark GraphX, you can use the ShortestPaths object, which is a member of the org.apache.spark.graphx.lib package.
Assuming you have a GraphX graph in g and you want to find the shortest path between the vertices with ids v1 and v2, you can do the following:
import org.apache.spark.graphx._
import org.apache.spark.graphx.lib.ShortestPaths

val result = ShortestPaths.run(g, Seq(v2))
val shortestPath = result                  // result is a graph
  .vertices                                // we get the vertices RDD
  .filter { case (vId, _) => vId == v1 }   // we filter to get only the shortest path from v1
  .first                                   // there's only one value
  ._2                                      // the result is a tuple (v1, Map)
  .get(v2)                                 // we get its shortest path to v2 as an Option object
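The ShortestPaths GraphX algorithm returns a graph whose vertices RDD contains tuples in the format (vertexId, Map(target -> shortestPath)). Note that shortestPath here is the length of the path as an Int (the number of edges followed along edge direction, ignoring edge attributes), not the sequence of vertices on the path. This graph will contain all vertices of the original graph, and their shortest paths to all target vertices passed in the Seq argument of the algorithm. A target that cannot be reached from a vertex has no entry in that vertex's Map.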
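To make the result format concrete, here is a minimal sketch that runs the algorithm on a toy graph with two targets. The object name ShortestPathsDemo, the helper landmarkDistances, the local master setting and the four-vertex chain are all assumptions made for illustration; they are not part of your setup or of GraphX.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.graphx.lib.ShortestPaths

// Hypothetical demo object, not part of GraphX.
object ShortestPathsDemo {

  // Builds a toy directed chain 1 -> 2 -> 3 -> 4 and runs ShortestPaths
  // with two targets (3 and 4). Returns vertexId -> Map(target -> hop count).
  def landmarkDistances(sc: SparkContext): Map[VertexId, Map[VertexId, Int]] = {
    // The edge attribute (1) is a placeholder; ShortestPaths does not read it.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 4L, 1)))
    val g = Graph.fromEdges(edges, defaultValue = 0)
    ShortestPaths.run(g, Seq(3L, 4L)).vertices.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, only suitable for trying this out.
    val conf = new SparkConf().setAppName("ShortestPathsDemo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Values for this toy graph:
      //   vertex 1: 3 -> 2 hops, 4 -> 3 hops
      //   vertex 2: 3 -> 1 hop,  4 -> 2 hops
      //   vertex 3: 3 -> 0 hops, 4 -> 1 hop
      //   vertex 4: 4 -> 0 hops (3 is not reachable from 4, so it has no entry)
      landmarkDistances(sc).toSeq.sortBy(_._1).foreach {
        case (id, spMap) => println(s"$id: $spMap")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Every vertex of the original graph appears in the output, and vertex 4 shows why the final lookup returns an Option: with directed edges it cannot reach vertex 3, so its Map has no key for it.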
In your case, you want the shortest path between two specific vertices, so in the code above I show how to call the algorithm with only one target (v2), and then I filter the result to get only the shortest path starting from the desired vertex (v1).
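If you need this lookup in more than one place, the same steps can be wrapped in a small function. This is only a sketch: PathLookup and hopCount are hypothetical names, not GraphX API. The one behavioural difference from the snippet above is that it uses collect().headOption instead of first, so a source id that does not exist in the graph yields None rather than an exception.

```scala
import org.apache.spark.graphx.{Graph, VertexId}
import org.apache.spark.graphx.lib.ShortestPaths
import scala.reflect.ClassTag

// Hypothetical helper object, not part of GraphX.
object PathLookup {

  // Number of edges on a shortest directed path from `from` to `to`.
  // Returns None if `from` is not in the graph or `to` is unreachable from it.
  def hopCount[ED: ClassTag](g: Graph[_, ED], from: VertexId, to: VertexId): Option[Int] =
    ShortestPaths.run(g, Seq(to))              // single target, as in the answer
      .vertices
      .filter { case (id, _) => id == from }   // keep only the source vertex
      .map { case (_, spMap) => spMap.get(to) }
      .collect()                               // at most one element
      .headOption
      .flatten
}
```

With the graph and ids from your question, the call would be PathLookup.hopCount(g, v1, v2).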