I'm new to GraphFrames and am trying to implement edge betweenness.

I tried the built-in `shortestPaths` function. It returns the distance from the source to the destination vertex, but not the actual path between them.

The output is:

| id | name    | age | distances |
| g  | Gabriel | 33  | [e -> 2]  |

Is there any way to get the actual path instead of just the distance?

If anyone could tell me how to implement edge betweenness efficiently in GraphFrames, that would be really great.

Shubham Yadav

1 Answer

You can try the BFS algorithm, described at https://graphframes.github.io/graphframes/docs/_site/user-guide.html#breadth-first-search-bfs. Unlike `shortestPaths`, it returns the vertices and edges traversed on the way from the source to the destination.
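As a minimal sketch of the BFS route in Python: the helper name `bfs_paths` is made up for illustration, and it assumes `graph` is a `GraphFrame` whose vertices have a string `id` column, as in the question's output.

```python
def bfs_paths(graph, src_id, dst_id, max_len=10):
    """Return the shortest path(s) from src_id to dst_id as a DataFrame.

    `graph` is assumed to be a graphframes.GraphFrame. The result has one
    row per path, with columns from, e0, v1, e1, ..., to.
    """
    return graph.bfs(
        fromExpr=f"id = '{src_id}'",   # SQL expression selecting the start vertex
        toExpr=f"id = '{dst_id}'",     # SQL expression selecting the end vertex
        maxPathLength=max_len,         # give up on paths longer than this
    )
```

With the graph from the question, `bfs_paths(g, "g", "e").show()` should list the intermediate vertex and both edges of the 2-hop path from `g` to `e`. If no path exists within `max_len` hops, the result is empty.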

Another option is motif finding. You write a pattern, and GraphFrames returns every path in the graph that matches it, so the path length is fixed by the motif you write.

For example:

(a)-[e1]->(b); (b)-[e2]->(c)

means: show me all paths of length 2, including the vertices and edges along with their properties. From there you can filter for the path you want, or use anonymous vertices or edges if you don't need them in the result.
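A rough Python sketch of that motif plus a filter on its endpoints. The helper `two_hop_paths` is a hypothetical name, and it again assumes `graph` is a `GraphFrame` with string vertex ids.

```python
TWO_HOP = "(a)-[e1]->(b); (b)-[e2]->(c)"

def two_hop_paths(graph, src_id, dst_id):
    """Return all 2-edge paths from src_id to dst_id.

    `graph` is assumed to be a graphframes.GraphFrame. `find` returns a
    DataFrame with one struct column per named motif element
    (a, e1, b, e2, c), so their fields can be filtered with SQL expressions.
    """
    paths = graph.find(TWO_HOP)
    return paths.filter(f"a.id = '{src_id}' AND c.id = '{dst_id}'")
```

For example, `two_hop_paths(g, "g", "e").select("a.id", "b.id", "c.id").show()` would print just the vertex ids along each matching path.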

One caveat: on large graphs you may run into performance problems with motifs or BFS, because both are implemented with self-joins. See https://www.waitingforcode.com/apache-spark-graphframes/motifs-finding-graphframes/read for more information.
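As for edge betweenness itself, if the edge list is small enough to collect to the driver, one option that avoids the joins entirely is to compute it locally. The following is a plain-Python sketch of Brandes-style accumulation for a directed, unweighted graph, not a GraphFrames API and not a distributed implementation. It assumes there are no duplicate edges, and the function name is made up for illustration.

```python
from collections import defaultdict, deque

def edge_betweenness(edges):
    """Unnormalized edge betweenness for a directed, unweighted graph.

    `edges` is an iterable of (src, dst) pairs with no duplicates.
    Returns {(src, dst): number of shortest paths through that edge},
    where ties between equally short paths are split fractionally.
    """
    edges = list(edges)
    adj = defaultdict(list)
    nodes = set()
    for s, d in edges:
        adj[s].append(d)
        nodes.update((s, d))
    score = {e: 0.0 for e in edges}

    for source in nodes:
        # BFS from source: count shortest paths (sigma) and record predecessors.
        dist = {source: 0}
        sigma = defaultdict(float)
        sigma[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, crediting each edge with
        # its share of the shortest paths that pass through it.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[(v, w)] += share
                delta[v] += share
    return score
```

The input could be built from a GraphFrame with something like `[(r["src"], r["dst"]) for r in g.edges.select("src", "dst").collect()]`. For graphs too large to collect, this approach does not apply and a distributed formulation would be needed.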

Oscar Lopez M.