
I have a method that depends on two files for input and produces two or more files as output. There is no one-to-one mapping between input files and output files: the method performs operations that combine the input data to produce the various output files.

Here's a (somewhat contrived) example:

class Util
  def self.perform
    `cat source1.txt source2.txt > target1.txt`
    `cat source2.txt source1.txt > target2.txt`
  end
end

Edit: The actual code in my case is much more complicated than this, and it is not viable to separate the work done to produce each target file into distinct parts. Hence setting up one file task per target file is not really an option in my case.

How do I set up a rake task that runs this method once, and only when needed, i.e. if one or more of the target files are missing, or if any of the source files is newer than any of the target files?

My current solution looks like this:

task :my_task => ['source1.txt', 'source2.txt'] do |t|
  targets = ['target1.txt', 'target2.txt']
  if targets.all? { |f| File.exist? f }
    mtime_newest_source = t.prerequisites.map {|f| File.mtime(f) }.max
    mtime_oldest_target = targets.map {|f| File.mtime(f) }.min
    # Stop task early if all targets are up to date
    next if mtime_newest_source < mtime_oldest_target
  end
  # Code that actually produces target files goes here
end

What I am looking for is a way to define the task so that none of the code within its block will be run unless the target files need to be rebuilt.

Lars Haugseth
  • What do you mean by rake functionality? Do you mean that rake should regenerate the files automatically whenever a file changes? Please also include your current solution. – Paritosh Piplewar Jun 06 '14 at 00:16
  • My current solution is a plain `task :my_task => ['source1.txt', 'source2.txt']` which contains code at the start that explicitly checks whether all target files exist and have mtime larger than all the prerequisites, then skips doing any actual work if that is the case. I'd like to know if there is any way to define a Rake task so that no code within its block will be run at all unless any prerequisites changed after any of the target files it produces. – Lars Haugseth Jun 06 '14 at 00:31
  • I updated the question with an example of my current solution. – Lars Haugseth Jun 06 '14 at 00:41
  • I'm still looking for a similar solution. Too bad that rake makes this so difficult. MSBuild for the Windows platform makes it so easy, and so insanely fast too. – C.J. Nov 25 '15 at 13:01

2 Answers


FileTasks are designed for exactly this. Here's a good old long explanation, and here's a quick example:

file 'target1.txt' => ['source1.txt', 'source2.txt'] do
  # do something to generate target1.txt
  `cat source1.txt source2.txt > target1.txt`
end

file 'target2.txt' => ['source1.txt', 'source2.txt'] do
  # again, generate the file
  `cat source2.txt source1.txt > target2.txt`
end

File tasks can depend on other rake tasks or on files on disk. If your task depends on a file, rake compares the timestamps and only runs the task if the target file is missing or the file it depends on is newer than the target.
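The timestamp rule described above can be sketched in plain Ruby. This is only an approximation of the check rake performs for a file task, and `stale?` is a hypothetical helper name chosen for illustration, not part of rake's API:

```ruby
# Hypothetical helper (not part of rake): returns true when `target`
# should be rebuilt from `sources`.
def stale?(target, sources)
  # A missing target always needs to be built.
  return true unless File.exist?(target)

  # Otherwise rebuild only if some source is strictly newer than the target.
  target_mtime = File.mtime(target)
  sources.any? { |source| File.mtime(source) > target_mtime }
end
```

For example, `stale?('target1.txt', ['source1.txt', 'source2.txt'])` would be true right after editing either source file, and false again once the target has been regenerated.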

Update

Here's a working example that generates all of the targets from both tasks. Rake is smart enough to skip the second task when the first one has already created both files.

def make_the_targets
  `cat source1.txt source2.txt > target1.txt`
  `cat source2.txt source1.txt > target2.txt`
end

file 'target1.txt' => ['source1.txt', 'source2.txt'] do
  puts "making target1.txt"
  make_the_targets
end

file 'target2.txt' => ['source1.txt', 'source2.txt'] do
  puts "making target2.txt"
  make_the_targets
end

task :make_targets => ['target1.txt', 'target2.txt']
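
To check the claim that the shared build step runs only once, here is a minimal standalone sketch that drives rake from a plain Ruby script instead of a Rakefile. It makes a few assumptions that are not in the example above: it works in a temporary directory, uses `File.write` in place of `cat` so it does not depend on a shell, and counts runs in a hypothetical global `$build_runs`.

```ruby
require 'rake'
require 'tmpdir'

extend Rake::DSL # makes `file` and `task` available outside a Rakefile

$build_runs = 0 # hypothetical counter: how often the shared step executes

# Shared build step that produces both targets in one go.
def make_the_targets
  $build_runs += 1
  one = File.read('source1.txt')
  two = File.read('source2.txt')
  File.write('target1.txt', one + two)
  File.write('target2.txt', two + one)
end

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    File.write('source1.txt', "one\n")
    File.write('source2.txt', "two\n")

    # One file task per target, all sharing the same action.
    %w[target1.txt target2.txt].each do |target|
      file target => %w[source1.txt source2.txt] do
        make_the_targets
      end
    end
    task :make_targets => %w[target1.txt target2.txt]

    Rake::Task[:make_targets].invoke
    puts "build step ran #{$build_runs} time(s)"
  end
end
```

With both targets missing, the first file task should build everything, and the second should find its target already up to date by the time it is checked, so the script is expected to report a single run.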
James Mason
  • This solution only works when the work of producing the distinct target files can be split up into wholly separate parts. This is not the case in my scenario. I will update my question to make this more clear. – Lars Haugseth Jun 06 '14 at 00:45
  • Maybe have both file tasks call a shared function that creates all of the target files? You'd need to test that to be sure rake notices the new files before running the second task. I'm not sure if rake determines dependencies up front or during execution. – James Mason Jun 06 '14 at 02:03
  • The update: wait, what - it's that simple? Duh! Thanks! I can even avoid duplication and the method extraction by doing `['target1.txt', 'target2.txt'].each { |target| file(target => ['source1.txt', 'source2.txt']) { do_stuff_here } }` – Lars Haugseth Jun 06 '14 at 07:13

I don't believe the above answers are correct. I believe the snippet from Lars's comment,

['target1.txt', 'target2.txt'].each { |target| file(target => ['source1.txt', 'source2.txt']) { do_stuff_here } }

will run { do_stuff_here } twice if rake is called with both 'target1.txt' and 'target2.txt'.

I believe a correct resolution of the dependencies is achieved by

file('target1.txt' => ['source1.txt', 'source2.txt']) { do_stuff_here }
file 'target2.txt' => 'target1.txt'

Andy