
How do I get the output of an external command and extract values from it?

I have something like this:

stdin, stdout, stderr, wait_thr = Open3.popen3("#{path}/foobar", configfile)

if /exit 0/ =~ wait_thr.value.to_s
    runlog.puts("Foobar exited normally.\n")
    puts "Test completed."
    someoutputvalue = stdout.read("TX.*\s+(\d+)\s+")
    puts "Output value: " + someoutputvalue
end

I'm not using the right method on stdout since Ruby tells me it can't convert String into Integer.

So for instance, if the output is

"TX So and so:     28"

I would like to get only "28". I validated that the regex above matches what I need to match, I'm only wondering how to store that extracted value in a variable.

What is the right way of doing this? I can't find the methods available for stdout anywhere in the documentation. I'm using stdout.read from Ruby 1.9.3.

Emma
Astaar
  • If you don't want to communicate to the process and just read its output, `capture3` would be more appropriate (and more simple). – undur_gongor Jun 12 '13 at 13:47
  • What's the argument to read do? I thought the only argument you passed was how far into the stream to go? Maybe it's trying to treat your string argument as the integer number of bytes? – Joe Pym Jun 12 '13 at 13:52
  • @JoePym the argument is an XML file used by foobar to run, if that's your question. – Astaar Jun 12 '13 at 13:54
  • I think that's the problem: Open3.popen3("grep") do |stin,stdout,stderr| stdout.read("aaaa") end raises the same error. Your argument to read in a stream case is the number of bytes to seek into the stream. – Joe Pym Jun 12 '13 at 13:57
  • Is there any way to apply a regex to the stream instead of seeking to bytes? I guess that's the core question due to my lack of knowledge on Ruby methods :) – Astaar Jun 12 '13 at 14:14
  • `capture3` is the simple way of executing something if you don't need to pass anything in via STDIN. – the Tin Man Jun 12 '13 at 14:21

1 Answer


All the information needed is in the Open3 documentation, but you have to read it all and look at the examples pretty carefully. You can also glean useful information from the Process docs.

Maybe this will 'splain it better:

require 'open3'

captured_stdout = ''
captured_stderr = ''
exit_status = Open3.popen3(ENV, 'date') {|stdin, stdout, stderr, wait_thr|
  pid = wait_thr.pid # pid of the started process.
  stdin.close
  captured_stdout = stdout.read
  captured_stderr = stderr.read
  wait_thr.value # Process::Status object returned.
}

puts "STDOUT: " + captured_stdout
puts "STDERR: " + captured_stderr
puts "EXIT STATUS: " + (exit_status.success? ? 'succeeded' : 'failed')

Running that outputs:

STDOUT: Wed Jun 12 07:07:12 MST 2013
STDERR:
EXIT STATUS: succeeded

Things to note:

  • You often have to close the stdin stream. If the called application expects input on STDIN it will hang until it sees the stream close, then will continue its processing.
  • stdin, stdout, stderr are IO handles, so you have to read the IO class documentation to find out what methods are available.
  • You write to stdin using puts, print or write, and read from stdout and stderr using read or gets.
  • exit_status isn't a string, it's an instance of the Process::Status class. You can mess with trying to parse from its to_s version, but don't. Instead use the accessors to see what it returned.
  • I passed in the ENV hash, so the child program had access to the entire environment the parent saw. It's not necessary to do that; instead you can create a reduced environment for the child if you don't want it to have access to everything, or you can mess with its view of the environment by changing values.
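  • The code stdout.read("TX.*\s+(\d+)\s+") posted in the question is, um... nonsense. IO#read takes an optional length in bytes, not a pattern, which is why Ruby complains that it can't convert String into Integer. I have no idea where you got that as nothing like that is documented in Ruby's IO class for IO#read or IO.read.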
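
The notes above can be pulled together in one short sketch. It is only an illustration under a few assumptions: the child process is a Ruby one-liner started through RbConfig.ruby (chosen so the script runs wherever Ruby does), and the GREETING variable and the child_* names are hypothetical.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical child: reads all of STDIN and prints it upcased,
# prefixed with the value of an environment variable.
child_code = 'puts ENV["GREETING"].to_s + " " + STDIN.read.upcase'

# A hash passed as the first argument sets or overrides variables
# in the child's environment.
child_env = { 'GREETING' => 'hello' }

child_out = ''
child_err = ''
status = Open3.popen3(child_env, RbConfig.ruby, '-e', child_code) do |stdin, stdout, stderr, wait_thr|
  stdin.puts 'world'        # write to the child's STDIN
  stdin.close               # the child blocks in STDIN.read until this happens
  child_out = stdout.read   # no argument: read everything until EOF
  child_err = stderr.read
  wait_thr.value            # Process::Status, returned from the block
end

puts child_out
puts "exit code: #{status.exitstatus}" if status.exited?
```

Here the exit code comes from the Process::Status accessors (exited?, exitstatus, success?) rather than from parsing its to_s output.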

It's easier to use capture3 if you don't need to write to STDIN of the called code:

require 'open3'

stdout, stderr, exit_status = Open3.capture3('date')

puts "STDOUT: " + stdout
puts "STDERR: " + stderr
puts "EXIT STATUS: " + (exit_status.success? ? 'succeeded' : 'failed')

Which outputs:

STDOUT: Wed Jun 12 07:23:23 MST 2013
STDERR:
EXIT STATUS: succeeded
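
As a side note, capture3 also accepts a stdin_data option, so a fixed input string can be handed to the child without dropping back to popen3. A minimal sketch, assuming a hypothetical Ruby one-liner as the child program:

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical child: prints whatever arrives on STDIN, reversed.
reverser = 'print STDIN.read.reverse'

# stdin_data is written to the child's STDIN, which is then closed.
reversed, child_errors, child_status =
  Open3.capture3(RbConfig.ruby, '-e', reverser, stdin_data: 'abc')

puts reversed
puts child_status.success? ? 'succeeded' : 'failed'
```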

Extracting a value from a string using a regular expression is trivial, and well covered by the Regexp documentation. Starting from the last code example:

stdout[/^\w+ (\w+ \d+) .+ (\d+)$/]
puts "Today is: " + [$1, $2].join(' ')

Which outputs:

Today is: Jun 12 2013

That's using the String#[] method, which is extremely flexible.
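
A few of the forms String#[] accepts with a regular expression, as a small sketch. The variable names are illustrative and the input is the date line from the output shown earlier:

```ruby
date_line = "Wed Jun 12 07:23:23 MST 2013"

# Regexp only: returns the whole match.
year = date_line[/\d+$/]

# Regexp plus an index: returns that capture group, so $1 isn't needed.
mon_day = date_line[/^\w+ (\w+ \d+) /, 1]

# Regexp plus a name: returns the named capture group.
named_year = date_line[/(?<year>\d+)$/, 'year']

# No match returns nil instead of raising.
missing = date_line[/nomatch/]

puts "Today is: #{mon_day} #{year}"
```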

An alternative is using "named captures":

/^\w+ (?<mon_day>\w+ \d+) .+ (?<year>\d+)$/ =~ stdout
puts "Today is: #{ mon_day } #{ year }"

which outputs the same thing. The downside to named captures is they're slower for what I consider a minor bit of convenience.

Applied to the sample output from the question:


"TX So and so: 28"[/\d+$/]
=> "28"
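
Putting the pieces together for the case in the question might look like the sketch below. The foobar program isn't available here, so a Ruby one-liner stands in for it (an assumption), and the regex is one possible pattern for the sample line, not necessarily the one the real output needs.

```ruby
require 'open3'
require 'rbconfig'

# Stand-in for the question's foobar program (hypothetical): prints one TX line.
fake_foobar = 'puts "TX So and so:     28"'

output, errors, status = Open3.capture3(RbConfig.ruby, '-e', fake_foobar)

if status.success?
  # Grab the digits at the end of the line that starts with "TX";
  # tx_value is nil if no such line exists.
  tx_value = output[/^TX.*\s(\d+)\s*$/, 1]
  puts "Output value: #{tx_value}"
else
  warn "foobar failed with exit code #{status.exitstatus}: #{errors}"
end
```

The captured value is still a String; call to_i on it if a number is needed.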
the Tin Man
  • But if I just `puts stdout.read` it prints correctly, why is the use of an intermediate variable necessary (in my case the regex is done in the block so there's no scope issue) ? How would I extract a value using a regex on captured_stdout in your example? – Astaar Jun 12 '13 at 14:13
  • I stored it in an intermediate variable to show you how to access that from outside the block. If you don't need to do that then don't. And, you'd use a normal regular expression or a substring to extract a value from `captured_stdout`, there's nothing mysterious or magical, it's a string containing whatever the called code returns. – the Tin Man Jun 12 '13 at 14:18
  • I have read the documentation closely, my question is more general around Ruby, basically asking how to store a value extracted from a string through a regular expression, which is not covered by the Popen3 documentation.. nor the Regexp documentation as far as I can tell. The example you gave is already working for me, since I have read the documentation. – Astaar Jun 12 '13 at 14:30
  • Then your question is a red-herring. If you want to know how to use Regular expressions reduce your question to just that. I'll add an example. – the Tin Man Jun 12 '13 at 14:33