I have an array of hashes:

my @arr = get_from_somewhere();

The contents of @arr are (for example):

@arr = (
  { id => "id2",    requires => 'someid', text => "another text2" },
  { id => "xid4",   requires => 'id2',    text => "text44" },
  { id => "someid", requires => undef,    text => "some text" },
  { id => "id2",    requires => 'someid', text => "another text2" },
  { id => "aid",    requires => undef,    text => "alone text" },
  { id => "id2",    requires => 'someid', text => "another text2" },
  { id => "xid3",   requires => 'id2',    text => "text33" },
);

I need something like:

my $texts = join("\n",  get_ordered_texts(@arr) );

So I need to write a sub that returns the array of texts from the hashes in dependency order. From the above example I need to get:

"some text",     #someid: id2 depends on it, so it needs to come before id2
"another text2", #id2:    xid3 and xid4 depend on it, and it depends on someid
"text44",        #xid4:   xid4 and xid3 can be in any order, because nothing depends on them,
"text33",        #xid3:   but they need to come below id2
"alone text",    #aid:    nothing depends on aid and it has no dependencies, so this line can be anywhere

As you can see, @arr can contain duplicated "lines" ("id2" in the above example); each id needs to be output only once.

I'm not providing any code example yet, because I have no idea how to start. ;( Is there a CPAN module that can be used for the solution?

Can anybody point me in the right direction?

zostay
cajwine

4 Answers


Using Graph:

use Graph qw( );

my @recs = (
   { id => "id2",    requires => 'someid', text => "another text2" },
   { id => "xid4",   requires => 'id2',    text => "text44" },
   { id => "someid", requires => undef,    text => "some text" },
   { id => "id2",    requires => 'someid', text => "another text2" },
   { id => "aid",    requires => undef,    text => "alone text" },
   { id => "id2",    requires => 'someid', text => "another text2" },
   { id => "xid3",   requires => 'id2',    text => "text33" },
);

sub get_ordered_recs {
   my %recs;
   my $graph = Graph->new();
   for my $rec (@_) {
      my ($id, $requires) = @{$rec}{qw( id requires )};

      $graph->add_vertex($id);
      $graph->add_edge($requires, $id) if $requires;

      $recs{$id} = $rec;
   }

   return map $recs{$_}, $graph->topological_sort();
}

my @texts = map $_->{text}, get_ordered_recs(@recs);
ikegami
  • @jm666, "In mathematics, a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links." – ikegami Aug 28 '12 at 23:20

An interesting problem.

Here's my first round solution:

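sub get_ordered_texts {
    my @pending = @_; # records still waiting for their dependency
    my %dep_found;    # track the set of ids already output
    my @sorted_arr;   # output

    while (@pending) {
        my $last_count = @pending; # infinite loop protection
        my @deferred;              # records to retry on the next pass

        for my $value (@pending) {

            # skip duplicates: each id is output only once
            next if $dep_found{ $value->{id} };

            # retry later unless we are ready for this text
            if (defined $value->{requires}
                and not $dep_found{ $value->{requires} }) {
                push @deferred, $value;
                next;
            }

            # Add to the sorted list
            push @sorted_arr, $value->{text};

            # Remember that we found it
            $dep_found{ $value->{id} }++;
        }

        # no progress in a whole pass means nothing left can be resolved
        die "some requirements don't exist or there is a dependency loop"
            if @deferred == $last_count;
        @pending = @deferred;
    }

    # return a list so that join("\n", get_ordered_texts(@arr)) works
    return @sorted_arr;
}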

This is not terribly efficient: in the worst case (a long dependency chain listed in reverse order) each pass resolves only one record, so it runs in O(n²) time. If you don't have a huge dataset, it's probably OK.

zostay
  • Btw, I always assume my solution to something like this can be improved, but I don't really know. This is just what struck me off the top of my head. – zostay Aug 28 '12 at 20:01
  • Now I need to solve a problem similar to the one the OP asked about, but I'm trying to find a solution that does not depend on a BIG CPAN module like Graph.pm. I tested your solution, but unfortunately it does not work for the OP's example and dies with _"some requirements don't exist..."_. Can you please modify the code to print the needed solution (so, print all lines in the required order, where the "alone" lines (without deps) can be printed anywhere)? That would help me a lot. :) – kobame May 04 '13 at 15:16

I would use a directed graph to represent the dependency tree and then walk the graph. I've done something very similar using Graph.pm.

Each of your hashes would be a graph vertex, and each edge would represent a dependency. This has the added benefit of supporting more complex dependencies in the future, as well as providing shortcut functions for working with the graph.
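To illustrate the vertex/edge idea without pulling in Graph.pm, here is a minimal sketch that keeps the graph in plain hashes and walks it depth-first. The sub name texts_in_dependency_order is made up for this example, and allowing `requires` to be an array ref of several ids is an assumption (the question only uses a single id or undef); it is shown only to hint at how "more complex dependencies" could look.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Graph.pm: a depth-first walk over a
# dependency graph stored in plain hashes. ASSUMPTION: `requires` is undef,
# a single id, or an array ref of ids.
sub texts_in_dependency_order {
    my (%rec_for, @ids);
    for my $rec (@_) {
        next if $rec_for{ $rec->{id} };   # keep the first copy of a duplicate
        $rec_for{ $rec->{id} } = $rec;    # vertex
        push @ids, $rec->{id};
    }

    my (%state, @out);                    # state: 1 = in progress, 2 = done
    my $visit;
    $visit = sub {
        my ($id) = @_;
        my $rec = $rec_for{$id} or die "unknown id '$id'\n";
        return if ($state{$id} // 0) == 2;
        die "dependency loop at '$id'\n" if ($state{$id} // 0) == 1;
        $state{$id} = 1;

        # edges: normalise `requires` into a list of ids
        my $req  = $rec->{requires};
        my @deps = !defined($req)       ? ()
                 : ref($req) eq 'ARRAY' ? @$req
                 :                        ($req);
        $visit->($_) for @deps;           # output dependencies first

        $state{$id} = 2;
        push @out, $rec->{text};
    };
    $visit->($_) for @ids;
    undef $visit;                         # break the closure's self-reference
    return @out;
}
```

Called as join("\n", texts_in_dependency_order(@arr)), this should give one valid ordering for the question's data; records without dependencies simply appear wherever they are first reached.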

bot403
  1. You didn't say what to do if the dependencies are "independent" of each other.

    E.g. id1 requires id2; id3 requires id4; id3 requires id5. What should the order be (other than 2 before 1, and both 4 and 5 before 3)?

  2. What you want is basically a BFS (Breadth First Search) of a tree (directed graph) of dependencies (or a forest, depending on the answer to #1; a forest being a set of non-connected trees).

    To do that:

    • Find all of the root nodes (ids that don't have a requirement themselves).

      You can easily do that by making a hash of ALL the IDs using grep on your data structure.

    • Put all those root nodes into a starting array.

    • Then implement BFS. If you need help implementing basic BFS using an array and a loop in Perl, ask a separate question. There may be a CPAN module, but the algorithm/code is rather trivial (at least once you have written it once :)
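The steps above can be sketched roughly as follows, using only core Perl. The sub name get_ordered_texts_bfs is hypothetical, and the sketch assumes each record has at most one `requires` id, as in the question. Instead of grep, the roots are collected in the same loop that builds the parent-to-children hash.

```perl
use strict;
use warnings;

# Hypothetical name; a sketch of BFS over the dependency forest.
# ASSUMPTION: each record has at most one `requires` id.
sub get_ordered_texts_bfs {
    my (%text_for, %children, @queue);
    for my $rec (@_) {
        my ($id, $requires) = @{$rec}{qw( id requires )};
        next if exists $text_for{$id};            # output each id only once
        $text_for{$id} = $rec->{text};
        if (defined $requires) {
            push @{ $children{$requires} }, $id;  # edge: requirement -> dependent
        }
        else {
            push @queue, $id;                     # root node: no requirement
        }
    }

    my @texts;
    while (@queue) {
        my $id = shift @queue;                    # take from the front: breadth first
        push @texts, $text_for{$id};
        push @queue, @{ $children{$id} || [] };   # its dependents are now ready
    }

    # anything never reached hangs off a missing id or sits in a loop
    die "some requirements don't exist or there is a dependency loop\n"
        if @texts != keys %text_for;

    return @texts;
}
```

Each record is queued and visited once, so the walk is linear in the number of records. With the question's data the roots (someid, aid) come out first, then id2, then xid4 and xid3.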

DVK