
I have a URL that lists many zip files, and I need to download local copies of them. So far I have:

require 'open-uri'
require 'pry'

def download_xml(url, dest)
  open(url) do |u|                                # open-uri patches Kernel#open to fetch the URL
    File.open(dest, 'wb') { |f| f.write(u.read) } # write the response body to the local file
  end
end

urls = ["http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/"]

urls.each { |url| download_xml(url, url.split('/').last) }

However, I can't seem to access the zip files at that location or loop through them. How would I loop through each zip file at that URL so that they can be collected into the array and downloaded by the method?


1 Answer

I have used the Nokogiri gem to parse HTML, so first install Nokogiri:

sudo apt-get install build-essential patch
sudo apt-get install ruby-dev zlib1g-dev liblzma-dev
sudo gem install nokogiri
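
As an alternative to a system-wide gem install, the dependency can be declared with Bundler instead; a minimal sketch, assuming Bundler is installed:

# Gemfile
source 'https://rubygems.org'
gem 'nokogiri'

Then run bundle install in the project directory.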

A solution specific to your problem:

noko.rb

require 'rubygems'
require 'nokogiri'
require 'open-uri'

page = Nokogiri::HTML(open("http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/")) # Fetch the page with open-uri and parse it with Nokogiri
puts page.class   # => Nokogiri::HTML::Document

page.css('a').each do |file_link|              # for each <a> tag / link on the page
  next unless file_link.text.end_with?(".zip") # skip anything that isn't a zip file

  link = "http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/" + file_link.text # build the zip file's link
  puts link
  File.open(file_link.text, 'wb') do |file|
    file << open(link).read                    # save the zip file to the current directory
  end
  puts file_link.text + " has been downloaded."
end

I have explained the code with comments.
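
One caveat worth noting: the loop builds each link from the anchor's visible text, which works on this directory-style listing because the text and the href happen to match. As a sketch for pages where they differ, you could read the href attribute instead (assuming the hrefs are relative to the same base URL):

require 'uri'

page.css('a').each do |file_link|
  href = file_link['href']                   # use the href attribute rather than the link text
  next unless href && href.end_with?(".zip") # skip anything that isn't a zip file
  link = URI.join("http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/", href).to_s
  File.open(File.basename(href), 'wb') { |f| f << open(link).read }
end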

Ultimately, there is no choice but to parse the HTML page, generate the download links one by one, and download each file at the end.
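
To tie this back to your original code: a minimal sketch that collects the zip links into the urls array and then reuses your download_xml method (assuming that method is already defined as in the question):

require 'nokogiri'
require 'open-uri'

base = "http://feed.omgili.com/5Rh5AMTrc4Pv/mainstream/posts/"
page = Nokogiri::HTML(open(base))

# Collect every zip file name on the page and turn it into a full URL
urls = page.css('a')
           .map(&:text)
           .select { |name| name.end_with?(".zip") }
           .map { |name| base + name }

urls.each { |url| download_xml(url, url.split('/').last) }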

– mertyildiran
Awesome! I'll try that this evening! Thank you so much! I've read about Nokogiri but didn't have much of a resource. I work in C# MVC and by the time I get home in the evening, no one is available in our Local Ruby Users Group Slack team. Haha But, I really appreciate it - ESPECIALLY the comments! Thanks again, I'll report back this evening. – Daniel Glover Sep 23 '16 at 21:13