1

I have a list of 'jobs' (an array of hashes), each with an id and a parent id (:pid). A job with a parent cannot be executed until its parent has been. How would I detect a loop of dependencies?

The data set is shown below:

jobs = [
  {:id => 1,  :title => "a",  :pid => nil},
  {:id => 2,  :title => "b",  :pid => 3},
  {:id => 3,  :title => "c",  :pid => 6},
  {:id => 4,  :title => "d",  :pid => 1},
  {:id => 5,  :title => "e",  :pid => nil},
  {:id => 6,  :title => "f",  :pid => 2},
]

Following the parent links, the sequence of 'id' values loops: 2 > 3 > 6 > 2 > 3 > 6 > ... and so on.

Cizzle

2 Answers

5

This is a job for "topological sort", which Ruby's standard library provides as TSort. It works more efficiently when parents know their children than when children know their parent. Here's the inefficient version; you can speed it up by restructuring your data (into a hash that has :children instead of :pid, so that tsort_each_child can simply call node[:children].each instead of having to filter the whole array).

Since TSort is designed to work as a mix-in, we need to make a new class for the data (or alternatively refine or pollute Array). #tsort returns a list that is sorted from children to parents; since you want parents before children, we can just #reverse the result. If the jobs contain a dependency loop, #tsort raises TSort::Cyclic instead.

require 'tsort'

class TSArray < Array
  include TSort
  alias tsort_each_node each
  def tsort_each_child(node)
    each { |child| yield child if child[:pid] == node[:id] }
  end
end

begin
  p TSArray.new(jobs).tsort.reverse
rescue TSort::Cyclic
  puts "Nope."
end
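
The faster variant described above could look something like the following. This is only a sketch: it keeps the original array of jobs, builds the parent-to-children index once with group_by rather than storing a :children key, and uses the module-level TSort.tsort (which takes two callables) instead of a mix-in class. The helper name parents_first and the two sample arrays are made up for illustration.

```ruby
require 'tsort'

# Hypothetical helper: returns the jobs ordered parents-first,
# or raises TSort::Cyclic if the dependencies contain a loop.
def parents_first(jobs)
  # Build the parent-id => children index once, so each lookup
  # no longer has to scan the whole array.
  children_of = jobs.group_by { |job| job[:pid] }

  each_node  = ->(&block) { jobs.each(&block) }
  each_child = ->(job, &block) { children_of.fetch(job[:id], []).each(&block) }

  # tsort lists children before parents, so reverse it.
  TSort.tsort(each_node, each_child).reverse
end

# Illustrative data, not from the question.
acyclic_jobs = [
  {:id => 1, :title => "a", :pid => nil},
  {:id => 2, :title => "b", :pid => 1},
  {:id => 3, :title => "c", :pid => 2},
]
cyclic_jobs = [
  {:id => 2, :title => "b", :pid => 3},
  {:id => 3, :title => "c", :pid => 6},
  {:id => 6, :title => "f", :pid => 2},
]

p parents_first(acyclic_jobs).map { |job| job[:id] }

begin
  parents_first(cyclic_jobs)
rescue TSort::Cyclic
  puts "Nope."
end
```

The index costs one pass over the array; after that, each call to each_child is a single hash lookup.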
Amadan
  • Hi, thank you. When experimenting, I found I could remove `reverse` and still get the same results; would removing it have any detrimental effect? – Cizzle Jun 14 '19 at 14:47
  • If all you want is to detect cyclicity, you don't need `reverse`. If you want to have a parents-before-children sort in case there are no cycles (as you stated in your question), then you need `reverse`, since `tsort` sorts children-before-parents. – Amadan Jun 14 '19 at 22:20
0

The various algorithms for detecting a cycle in a directed graph are designed for arbitrary directed graphs. The graph depicted here is much simpler, in that each child has at most one parent. That makes it easy, and very quick, to determine whether a cycle is present.

I interpreted the question as meaning that, if a cycle were present, you wished to return one, not just the determination of whether one is present.

Code

require 'set'

def cycle_present?(arr)
  kids_to_parent = arr.each_with_object({}) { |g,h| h[g[:id]] = g[:pid] }
  kids = kids_to_parent.keys
  while kids.any?
    kid = kids.first
    visited = [kid].to_set
    loop do
      parent = kids_to_parent[kid]
      break if parent.nil? || !kids.include?(parent)
      return construct_cycle(parent, kids_to_parent) unless visited.add?(parent)
      kid = parent 
    end
    kids -= visited.to_a
  end
  false
end

def construct_cycle(parent, kids_to_parent)
  arr = [parent]
  loop do
    parent = kids_to_parent[parent]
    arr << parent
    break arr if arr.first == parent
  end
end

Examples

cycle_present?(jobs)
  #=> [2, 3, 6, 2]

arr = [{:id=>1, :title=>"a", :pid=>nil},
       {:id=>2, :title=>"b", :pid=>1},
       {:id=>3, :title=>"c", :pid=>1},
       {:id=>4, :title=>"d", :pid=>2},
       {:id=>5, :title=>"e", :pid=>2},
       {:id=>6, :title=>"f", :pid=>3}] 
cycle_present?(arr)
  #=> false
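
If only a yes/no answer is wanted, the same single-parent property allows a shorter check. The following is a minimal sketch rather than a replacement for the method above; the name cyclic? and the sample data are assumptions made for illustration.

```ruby
require 'set'

# Hypothetical helper: true if following :pid links from some job
# eventually returns to a job already seen on that walk.
def cyclic?(jobs)
  parent_of = jobs.each_with_object({}) { |job, h| h[job[:id]] = job[:pid] }
  cleared = Set.new   # ids already known not to lead into a cycle

  parent_of.each_key do |start|
    path = Set.new
    id = start
    # Walk upwards until there is no parent, the parent is unknown,
    # or we reach an id that an earlier walk has already cleared.
    while id && parent_of.key?(id) && !cleared.include?(id)
      # Set#add? returns nil when id is already on this path.
      return true unless path.add?(id)
      id = parent_of[id]
    end
    cleared.merge(path)
  end
  false
end

# Illustrative data, not from the question.
looped = [{:id => 2, :pid => 3}, {:id => 3, :pid => 6}, {:id => 6, :pid => 2}]
tree   = [{:id => 1, :pid => nil}, {:id => 2, :pid => 1}, {:id => 3, :pid => 1}]

p cyclic?(looped)
p cyclic?(tree)
```

Each id is walked at most once before it is either cleared or found on a cycle, so the work grows linearly with the number of jobs.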

Explanation

Here are the two methods with comments and puts statements added (the lines marked #!! are there only for tracing).

def cycle_present?(arr)
  kids_to_parent = arr.each_with_object({}) { |g,h| h[g[:id]] = g[:pid] }
  puts "kids_to_parent = #{kids_to_parent}"                                #!!
  # kids are nodes that may be on a cycle
  kids = kids_to_parent.keys
  puts "kids = #{kids}"                                                    #!!
  while kids.any?
    # select a kid
    kid = kids.first
    puts "\nkid = #{kid}"                                                  #!!
    # construct a set initially containing kid
    visited = [kid].to_set
    puts "visited = #{visited}"                                            #!!
    puts "enter loop do"                                                   #!!

    loop do
      # determine kid's parent, if has one
      parent = kids_to_parent[kid]
      puts "  parent = #{parent}"                                          #!!
      if parent.nil?                                                       #!!
        puts "  parent.nil? = true, so break"                              #!!
      elsif !kids.include?(parent)
        puts "  kids.include?(parent) #=> false, parent has been excluded" #!!
      end                                                                  #!!
      # if the kid has no parent or the parent has already been removed
      # from kids we can break and eliminate all kids in visited
      break if parent.nil? || !kids.include?(parent)
      # try to add parent to set of visited nodes; if can't we have
      # discovered a cycle and are finished
      puts "  visited.add?(parent) = #{!visited.include?(parent)}"         #!! 
      puts "  return construct_cycle(parent, kids_to_parent)" if
        visited.include?(parent)                                           #!!
      return construct_cycle(parent, kids_to_parent) unless visited.add?(parent)
      puts "  now visited = #{visited}"                                    #!!
      # the new kid is the parent of the former kid
      puts "  set kid = #{parent}"                                         #!!
      kid = parent 
    end

    # we found a kid with no parent, or a parent who has already
    # been removed from kids, so remove all visited nodes
    puts "after loop, set kids = #{kids - visited.to_a}"                   #!!
    kids -= visited.to_a
  end
  puts "after while loop, return false"                                    #!!
  false
end

def construct_cycle(parent, kids_to_parent)
  puts
  arr = [parent]
  loop do
    parent = kids_to_parent[parent] 
    puts "arr = #{arr}, parent = #{parent}"                                #!!
    arr << parent
    break arr if arr.first == parent
  end
end

cycle_present?(jobs)

displays the following (the initial kids_to_parent and kids lines are omitted):

kid = 1
visited = #<Set: {1}>
enter loop do
  parent = 
  parent.nil? = true, so break
after loop, set kids = [2, 3, 4, 5, 6]

kid = 2
visited = #<Set: {2}>
enter loop do
  parent = 3
  visited.add?(parent) = true
  now visited = #<Set: {2, 3}>
  set kid = 3
  parent = 6
  visited.add?(parent) = true
  now visited = #<Set: {2, 3, 6}>
  set kid = 6
  parent = 2
  visited.add?(parent) = false
  return construct_cycle(parent, kids_to_parent)

arr = [2], parent = 3
arr = [2, 3], parent = 6
arr = [2, 3, 6], parent = 2
  #=> [2, 3, 6, 2] 
Cary Swoveland