0

This drawing shows a tree of parent-child relationships. It is directed, without cycles. A child can have multiple parents.

The corresponding array of arrays in Perl is:

(
    [A C],
    [B C],
    [D F G],
    [C E D],
    [E J X I],
    [I J]
)

The first element in each sub-array is the parent of the rest, and the number of sub-arrays is the number of nodes who have at least one child.

Problem

I want to assign a number to each node which tells which level it is on in the graph. The level should also tell whether two nodes are independent, by which I mean they are not in direct parent-child relation. The answer to this specific example should (among many other answers) be:

[A B C D E F G X I J]
[1 1 2 3 3 4 4 4 4 5]

I solution can be implemented in any language, but Perl is preferred.

Still, non of the suggested solutions seems to work for this array:

(
  [ qw( Z A   )],
  [ qw( B D E ) ],
  [ qw( A B C ) ],    
  [ qw( G A E  )],
  [ qw( L B E )]  
)

as does

(
  [ qw/ M A / ],
  [ qw/ N A X / ],
  [ qw/ A B C / ],
  [ qw/ B D E / ],
  [ qw/ C F G / ], 
  [ qw/ F G / ]
  [ qw/ X C / ]
)
Gogi
  • 1,695
  • 4
  • 23
  • 36

3 Answers3

3

The Graph::Directed module will make it simpler to handle this kind of data.

Multiple source nodes makes it potentially more complicated (for instance if there was another edge [Y, X]) but as long as all the sources are at the first level it is workable.

Here is some code that produces the information you say you expect. It assumes all nodes below the top level are accessible from the first source node and measures their path length from there, ignoring the second source.

use strict;
use warnings;

use feature 'say';

use Graph::Directed;

my @data = (
  [ qw/ A C / ],
  [ qw/ B C / ],
  [ qw/ D F G / ],
  [ qw/ C E D / ],
  [ qw/ E J X I / ],
  [ qw/ I J / ],
);

my $graph = Graph->new(directed => 1);

for my $item (@data) {
  my $parent = shift @$item;
  $graph->add_edge($parent, $_) for @$item;
}

my ($source) = $graph->source_vertices;

for my $vertex (sort $graph->vertices) {
  my $path;
  if ($graph->is_source_vertex($vertex)) {
    $path = 0;
  }
  else {
    $path = $graph->path_length($source, $vertex);
  }
  printf "%s - %d\n", $vertex, $path+1;
}

output

A - 1
B - 1
C - 2
D - 3
E - 3
F - 4
G - 4
I - 4
J - 4
X - 4
Borodin
  • 126,100
  • 9
  • 70
  • 144
  • Just one thing; why do you need "sort $graph->vertices"? Thanks for the solution, apart from this one, easy to understand. – Gogi Jun 12 '12 at 18:36
  • the solution doesn't work for the array I just put in the problem section. Any way to find out what the problem may be. I know, it's much more complex array. – Gogi Jun 13 '12 at 16:01
  • `$graph->vertices` returns a list of all the vertices of the graph (`A`, `B`, `C`, `D` etc.) in no particular order. The call to `sort` puts them in alphabetic order just to be tidy and present them in the same order as in your question. – Borodin Jun 13 '12 at 18:06
  • Your problem may be the one I saw with multiple source nodes. Check `$graph->source_vertices` to see how many there are. What problem are you having? – Borodin Jun 13 '12 at 18:07
  • :I just put two arrays in the problem specification, and the solution doesn't work for them. There maybe a solution through first doing $g->topological_sort(), and then use this result and the original \@data array...maybe... – Gogi Jun 13 '12 at 18:59
  • @Moni: those two arrays give satisfactory results as far as I can see. Can you give a better idea what you mean by *level*? Does the data correspond to anything real-world that would help to imagine the problem? – Borodin Jun 14 '12 at 10:51
  • Level of parent should always be lower than all of it's children. In the first graph, E gets level 2 while it comes after B which gets 3 (that happens of course because it's also child of L). In the second graph, F and G gets the same level. Here, G is child of F, so G should get one higher level. Multiple parents one different levels has to taken into account for correct leveling. Hope you understand what I mean. – Gogi Jun 14 '12 at 13:44
1

[This calculates, for each node, the length of the shortest path from a root. But the OP want the length of the longest of the shortest path from each root.]

All you have to do is find the root nodes, then do a breadth-first traversal.

my %graph = map { my ($name, @children) = @$_; $name => \@children } (
    [qw( A C )],
    [qw( B C )],
    [qw( D F G )],
    [qw( C E D )],
    [qw( E J X I )],
    [qw( I J )]
);

my %non_roots = map { $_ => 1 } map @$_, values(%graph);
my @roots = grep !$non_roots{$_}, keys(%graph);

my %results;
my @todo = map [ $_ => 1 ], @roots;
while (@todo) {
   my ($name, $depth) = @{ shift(@todo) };
   next if $results{$name};

   $results{$name} = $depth;
   push @todo, map [ $_ => $depth+1 ], @{ $graph{$name} }
      if $graph{$name};
}

my @names  = sort { $results{$a} <=> $results{$b} || $a cmp $b } keys(%results);
my @depths = @results{@names};
print "@names\n@depths\n";
ikegami
  • 367,544
  • 15
  • 269
  • 518
  • Checked on some data, works fine. Thanks. I need to understand it though, haven't used map too much, need to read what it can do. Thanks again. – Gogi Jun 12 '12 at 18:38
  • Going with a module is usually a good idea, so take Borodin's. My point for posting this is that you should have at least tried to answer your own question. There's nothing essential about `map` that couldn't be done with another type of loop; it's just a bit simpler with `map`. – ikegami Jun 12 '12 at 19:17
  • the solution doesn't work for the array I just put in the problem section. Any way to find out what the problem may be. I know, it's much more complex array. – Gogi Jun 13 '12 at 16:02
0

Finally, I think I have solved the problem of finding correct levels, using Borodin's and ikegami's solutions (thanks guys, highly appreiciate your efforts):

#!/usr/local/perl -w 

use strict;
use warnings;
use Graph::Directed;
use List::Util qw( min max );

# my @data = (
# [ qw/ M A/ ],
# [ qw/ N A X/ ],
# [ qw/ A B C / ],
# [ qw/ B D E F/ ],
# [ qw/ C F G / ], 
# [ qw/ F G / ],
# [ qw/ X C G/ ],
# [ qw/ L A B /],
# [ qw/ Q M D/]
# );

# my @data = (
# [ qw( Z A   )],
# [ qw( B D E ) ],
# [ qw( A B C ) ],    
# [ qw( G A E  )],
# [ qw( L B E )]  
# );

# my @data = (
# [ qw/ M A / ],
# [ qw/ N A X / ],
# [ qw/ A B C / ],
# [ qw/ B D E / ],
# [ qw/ C F G / ], 
# [ qw/ F G / ],
# [ qw/ X C / ]
# );

my @data = (
[ qw/ A M B C/ ],
[ qw/ B D F C/ ],
[ qw/ D G/ ],
[ qw/ F G/ ],
[ qw/ C G/ ],
[ qw/ M G/ ],  
);


sub createGraph{
my @data = @{$_[0]};
my $graph = Graph->new(directed => 1);

foreach (@data) {
  my ($parent, @children) = @$_;
  $graph->add_edge($parent, $_) for @children;
}

my @cycleFound = $graph->find_a_cycle;    
print "$_\n" for (@cycleFound);
$graph->is_dag() or die("Graph has cycles - unable to sort\n");
$graph->is_weakly_connected() or die "Graph not weakly connected - unable to analyze\n";  
return $graph;
}

sub getLevels{
my @data = @{$_[0]};
my $graph = createGraph \@data;

my @artifacts = $graph->topological_sort();
chomp @artifacts; 
print "--------------------------\n";
print "Topologically sorted list: \n";
print "$_ " for @artifacts;        
print "\n--------------------------\n";

print "Initial levels (longest path):\n";
my @sources = $graph->source_vertices;
my %max_levels = map { $_=>[]} @artifacts;
my @levels = ();
for my $vertex (@artifacts) {
    my $path = 0;
    foreach(@sources){
        if(defined($graph->path_length($_, $vertex))){
            if ($graph->path_length($_, $vertex) > $path){
                $path = $graph->path_length($_, $vertex)
            }
        }
    }
 printf "%s - %d\n", $vertex, $path;
 push @levels, $path;
 push @{$max_levels{$vertex}}, $path;
}
print "--------------------------\n";

for (my $i = 0; $i < @levels; $i++){ 
my $parent_level = $levels[$i];
my $parent = $artifacts[$i];                
    for (my $j = $i+1; $j < @levels; $j++){ 
        my $child = $artifacts[$j];
        for (@data){
            my ($p, @c) = @{$_};
            if($parent eq $p){
                my @matches = grep(/$child/, @c);
                if(scalar(@matches) != 0){
                    $levels[$j]  = 1 + $parent_level;
                    push @{$max_levels{$child}},$levels[$j];
                    $levels[$j] = max @{$max_levels{$child}};
                }
            }
        }
    }            
}
print "Final levels:\n";
my %sorted = ();
for (my $i = 0; $i < @levels; $i++){
    $sorted{$artifacts[$i]} = $levels[$i];
}
my @orderedList = sort { $sorted{$a} <=> $sorted{$b} } keys %sorted;
print "$sorted{$_} $_\n" for @orderedList;
print "--------------------------\n";   
return  \%max_levels;
}

getLevels \@data;

Output:

    --------------------------
    Topologically sorted list:
    A M B D C F G
    --------------------------
    Initial levels (longest path):
    A - 0
    M - 1
    B - 1
    D - 2
    C - 1
    F - 2
    G - 2
    --------------------------
    Final levels:
    0 A
    1 M
    1 B
    2 F
    2 C
    2 D
    3 G
    --------------------------
Gogi
  • 1,695
  • 4
  • 23
  • 36
  • I'm still looking at your solution, but at first glance it seems to generate a different result for your original case from the one you posted, where I and J were both put on level 4. – Borodin Jun 19 '12 at 07:50
  • @Borodin: That was a mistake. I have corrected the the output I posted. – Gogi Jun 20 '12 at 09:53