How to flatten dependency graph?

Question

I am new to Apache Spark. Can I get a snippet showing how to implement 'flattening' for a dependency graph? For example, let's say I have nodes: A, B, C and edges: (A,B), (B,C).

The result would be a new graph with nodes: A, B, C and edges: (A,B), (A,C), (B,C).

This is non-trivial. There's certainly no out-of-the-box way to do it. What have you tried? — David Griffin, Jun 07 '16 at 19:28
I haven't tried anything yet. So far I have only tried to understand other algorithms, such as shortest path, so that I can customize or modify one. I looked at that algorithm because it is recursive too. — David H, Jun 08 '16 at 05:05

score 0 · Answer 1 · answered Jun 07 '16 at 20:34

1) Assume each node is in its own row:

A
B
C

2) As a first step, do a CROSS JOIN of the node table with itself.

A A
A B
A C
B A
B B
B C
C A
C B
C C

3) As a second step, filter out all the rows where the node name is repeated.

A B
A C
B A
B C
C A
C B

4) After that, derive another field from the two fields that identifies the edge.

A B   AB
A C   AC
B A   BA
B C   BC
C A   CA
C B   CB

You would need to convert this into Scala or Python syntax, though. Hope this helps.
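
The steps above can be sketched roughly as follows. This is a minimal illustration in plain Python rather than Spark code, so that it runs without a cluster; in Spark the self cross join would presumably map to something like `cartesian` on an RDD or `crossJoin` on a DataFrame. The function names `cross_join_pairs`, `reachable`, and `flatten` are hypothetical. Note that the cross join alone yields all six ordered pairs, while the question expects only (A,B), (A,C), (B,C), so the sketch adds a reachability filter that is an assumption on top of the steps listed here.

```python
from itertools import product


def cross_join_pairs(nodes):
    """Cross join the nodes with themselves, drop repeated names, label each pair."""
    rows = []
    for src, dst in product(nodes, repeat=2):   # self cross join
        if src != dst:                          # filter rows where the node repeats
            rows.append((src, dst, src + dst))  # derived field naming the edge
    return rows


def reachable(edges):
    """Assumption, not part of the listed steps: for each node that has
    outgoing edges, collect every node reachable by following directed edges."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)

    closure = {}
    for start in adjacency:
        seen = set()
        stack = list(adjacency[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(adjacency.get(node, ()))
        closure[start] = seen
    return closure


def flatten(nodes, edges):
    """Keep only the candidate pairs whose target is reachable from the source."""
    reach = reachable(edges)
    return [
        (src, dst)
        for src, dst, _label in cross_join_pairs(nodes)
        if dst in reach.get(src, set())
    ]


nodes = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C")]

print(cross_join_pairs(nodes))  # all six ordered pairs with their labels
print(flatten(nodes, edges))    # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

For large graphs the cross join grows quadratically with the number of nodes, so this is only meant to show the shape of the computation, not a scalable implementation.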
