
I am really new to Ruby and could use some help with a program. I need to open a zip file that contains multiple text files, each with many rows of data, e.g.:

CDI|3|3|20100515000000|20100515153000|2008|XXXXX4791|0.00|0.00
CDI|3|3|20100515000000|20100515153000|2008|XXXXX5648|0.00|0.00
CHO|3|3|20100515000000|20100515153000|2114|XXXXX3276|0.00|0.00
CHO|3|3|20100515000000|20100515153000|2114|XXXXX4342|0.00|0.00
MITR|3|3|20100515000000|20100515153000|0000|XXXXX7832|0.00|0.00
HR|3|3|20100515000000|20100515153000|1114|XXXXX0238|0.00|0.00

I first need to extract the zip file, read the text files inside it, and write only the complete rows that start with CDI or CHO to two output files: one for the rows starting with CDI and one for the rows starting with CHO (basically parsing the file). I have to do it in Ruby, and ideally the program should run automatically as new zip files with the same structure keep arriving. I would appreciate any advice, direction, or sample code anyone can give.

Linuxios

2 Answers


One way is to use the rubyzip library, which provides `Zip::ZipFile`.

require 'zip/zip'  # rubyzip 0.9.x; from rubyzip 1.0 on, use require 'zip' and Zip::File

# Open the zip file and pass each entry to the block
Zip::ZipFile.foreach(path_to_zip) do |entry|
  # Read the entry's contents in memory (nothing is extracted to disk)
  entry.get_input_stream.read.each_line do |line|
    if line.start_with?("CDI", "CHO")
      # Do something with the matching line
    end
  end
end
Charles Caldwell
  • Thank you very much. Is there anything I should adjust if the zip file contains 6 separate text files, all with data from which I need to output only those certain rows? I really appreciate all of your help thus far. Jay – user1487077 Jun 28 '12 at 16:58
  • No. `Zip::ZipFile.foreach` runs the block for every entry in the zip file. I use that method to iterate over zip files with several thousand entries. – Charles Caldwell Jun 28 '12 at 17:00
  • Note: The above code won't actually extract the zip file. It will go over each entry, read it, and analyze the contents without extracting it. If you need to extract it first, there are methods in the library I linked to for that. – Charles Caldwell Jun 28 '12 at 17:02
  • Thank you so much Charles. I will start diving into this. Actually, I don't think I need to extract the zip file if you're telling me that Ruby can go through the entire zip file and read and write out my data without extracting it first. I hope I understood this correctly. Thank you, Jay – user1487077 Jun 28 '12 at 17:08
  • Charles, I am still having some difficulty. I was trying to combine your code for the zip file with the other gentleman's output code, and I am stuck. Could you possibly help with a sample for the output files that works with your code for the zip file? I really appreciate it. Thank you, Jay – user1487077 Jun 29 '12 at 18:34
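
A minimal sketch of how the zip-reading loop and the two output files might be combined. The helper name `route_lines` is hypothetical and not part of rubyzip; it accepts any object that responds to `each_line`, so it can be tried out with a `StringIO` before wiring it to a zip entry. It matches on the first pipe-delimited field rather than with `start_with?`, which is an assumption about the data (a row such as `CDIX|...` would be skipped).

```ruby
require 'stringio'

# Copy each line of `input` whose first pipe-delimited field is a key of
# `outputs` to the IO stored under that key. Returns a hash of per-prefix
# line counts. (`route_lines` is a hypothetical helper name.)
def route_lines(input, outputs)
  counts = Hash.new(0)
  input.each_line do |line|
    prefix = line.split('|', 2).first   # e.g. "CDI" from "CDI|3|3|..."
    out = outputs[prefix]
    next unless out                     # skip MITR, HR, blank lines, etc.
    out.puts(line.chomp)                # chomp also drops a trailing "\r\n"
    counts[prefix] += 1
  end
  counts
end

# Possible use with the rubyzip 0.9.x API from the answer above
# (untested sketch; the output file names are taken from the other answer):
#
#   File.open('cdiout.txt', 'a') do |cdi|
#     File.open('choout.txt', 'a') do |cho|
#       Zip::ZipFile.foreach(path_to_zip) do |entry|
#         next if entry.directory?
#         route_lines(entry.get_input_stream, 'CDI' => cdi, 'CHO' => cho)
#       end
#     end
#   end
```

Because the outputs are opened once, outside the loop, rows from all entries in the zip end up in the same two files.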

I'm not sure if I entirely follow your question. For starters, if you're looking to unzip files using Ruby, check out this question. Once you've got the file unzipped to a readable format, you can try something along these lines to print to the two separate outputs:

cdi_output = File.open("cdiout.txt", "a")  # Open an output file for CDI
cho_output = File.open("choout.txt", "a")  # Open an output file for CHO

File.open("text.txt", "r") do |f|          # Open the input file
  while line = f.gets                      # Read each line in the input
    cdi_output.puts line if /^CDI/ =~ line # Print if line starts with CDI
    cho_output.puts line if /^CHO/ =~ line # Print if line starts with CHO
  end
end

cdi_output.close                           # Close cdi_output file
cho_output.close                           # Close cho_output file
Community
KChaloux
  • Thank you very much. I will try working with both examples; they look like a great starting point. As for my question about the zip file: I want Ruby to automatically unzip each new zip file that arrives via email with the data mentioned above, and then proceed with the output steps. Thank you for all your help thus far, Jay – user1487077 Jun 28 '12 at 16:53
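
For the "automatic" part, one possible approach is to poll a folder for new zip files. The sketch below assumes the email attachments are already being saved into an inbox folder by some other means (fetching mail is not covered here). The helper name `process_new_zips` and the folder names are hypothetical.

```ruby
require 'fileutils'

# Yield the path of every .zip file in `inbox_dir` (oldest first), then move
# it to `done_dir` so it is not picked up again on the next run.
# Returns the number of zip files handled.
# (`process_new_zips` is a hypothetical helper name.)
def process_new_zips(inbox_dir, done_dir)
  FileUtils.mkdir_p(done_dir)
  zips = Dir.glob(File.join(inbox_dir, '*.zip')).sort_by { |path| File.mtime(path) }
  zips.each do |path|
    yield path                                                    # parse the zip here
    FileUtils.mv(path, File.join(done_dir, File.basename(path)))  # mark as done
  end
  zips.size
end

# Possible use (folder names are assumptions):
#
#   loop do
#     process_new_zips('inbox', 'processed') do |zip_path|
#       # read the entries of zip_path and write the CDI/CHO rows
#     end
#     sleep 60
#   end
```

A file is moved only after the block returns without raising, so a zip that fails to parse stays in the inbox. Running the script once per minute from a scheduler such as cron would be an alternative to the `loop`/`sleep` shown in the comment.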