I am parsing CNN.com to get the top five news stories along with the first paragraph of each. I have the following code:
require 'mechanize'

url = "http://edition.cnn.com/?refresh=1"
agent = Mechanize.new
page = agent.get(url)
# Collect the headline links and resolve each href against the page URI
links = page.search("//div[@id='cnn_maintt2bul']/div/div/ul/li[count(*)=3]/a")
links.map { |a| page.uri.merge(a[:href]) }.each do |uri|
  article = agent.get(uri).parser
  # Print the paragraph that immediately follows the .adtag15090 element
  puts article.css(".adtag15090+ p").text
  puts "\n"
end
It's not perfect, but it works. However, it retrieves every article, and I only want the first five. Is there a way, perhaps using ranges, to limit the number of results to five?
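
To illustrate the kind of limiting I mean, here is a minimal sketch on a plain array. The `uris` list is a hypothetical stand-in for the links the XPath query returns (the story URLs are made up); `first`, `take`, and an exclusive range are standard Ruby Array operations.

```ruby
# Hypothetical placeholder URLs standing in for the scraped links.
uris = (1..8).map { |n| "http://edition.cnn.com/story-#{n}" }

top_by_first = uris.first(5)  # first five elements (fewer if the array is shorter)
top_by_range = uris[0...5]    # exclusive range: indices 0 through 4
top_by_take  = uris.take(5)   # same result as first(5)

top_by_first.each { |uri| puts uri }
```

In the code above, this would presumably mean calling `.first(5)` on the mapped array before `.each`, though I have not verified that against the live page.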