
I am new to Apache Spark. Can I get a snippet showing how to implement 'flattening' for a dependency graph? For example, let's say I have: nodes: A, B, C; edges: (A,B), (B,C)

The result would be a new graph: nodes: A, B, C; edges: (A,B), (A,C), (B,C)

David H
  • This is non-trivial. There's certainly no out-of-the-box way to do it. What have you tried? – David Griffin Jun 07 '16 at 19:28
  • I haven't tried anything yet. So far I have only tried to understand other algorithms, such as shortest path, so that I can customize or modify one. I looked at that algorithm because it is recursive too. – David H Jun 08 '16 at 05:05

1 Answer


1) Presuming each node is in its own row

A
B
C

2) Do a CROSS JOIN of the node table with itself.

A A
A B
A C
B A
B B
B C
C A
C B
C C

3) Next, filter out all the rows where the node name is repeated.

A B
A C
B A
B C
C A
C B

4) After that, derive another field from the two fields that tells you the edge.

A B   AB
A C   AC
B A   BA
B C   BC
C A   CA
C B   CB
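
As a rough illustration, the steps above can be sketched in plain Python, with `itertools.product` standing in for the cross join. This is only a sketch on an in-memory list, not a Spark job; the variable names are made up for the example, and on a Spark RDD the corresponding calls would be `cartesian`, `filter`, and `map`.

```python
from itertools import product

# Each node is in its own row.
nodes = ["A", "B", "C"]

# Cross join the node list with itself.
pairs = list(product(nodes, nodes))

# Filter out rows where the same node name appears in both columns.
distinct_pairs = [(a, b) for a, b in pairs if a != b]

# Derive a third field from the two columns that names the edge.
labelled = [(a, b, a + b) for a, b in distinct_pairs]

for row in labelled:
    print(*row)
```

Note that this produces all six ordered pairs, including reversed ones such as B A.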

You would need to translate this into Scala or Python syntax, though. Hope this helps.
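
One caveat: the cross join pairs every node with every other node, so it also yields B A, C A and C B, which are not in the graph the question expects. If only reachable pairs are wanted, one option is to start from the edge list and keep joining it with itself until no new edges appear (a transitive closure). Below is a minimal plain-Python sketch of that loop, under the assumption that the edge list fits in memory; `transitive_closure` is a hypothetical helper name, and on Spark RDDs the same steps would map onto `join`, `union`, and `distinct`.

```python
def transitive_closure(edges):
    """Return every (src, dst) pair such that dst is reachable from src."""
    closure = set(edges)
    while True:
        # Join known paths (a, b) with edges (b, c) on the shared node b,
        # which gives the longer path (a, c).
        new_edges = {(a, c) for (a, b) in closure for (b2, c) in edges if b == b2}
        merged = closure | new_edges
        if merged == closure:
            # Nothing new was added, so the closure is complete.
            return closure
        closure = merged


edges = [("A", "B"), ("B", "C")]
print(sorted(transitive_closure(edges)))
# prints [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

If the graph contains a cycle, this sketch also emits self-pairs such as (A, A), which you may want to filter out afterwards.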

Amit