How can I extract all href
options in an <a>
tag from a page while reading in a file?
If I have a text file that contains the target URLs:
http://mypage.com/1.html
http://mypage.com/2.html
http://mypage.com/3.html
http://mypage.com/4.html
Here's the code I have:
File.open("myfile.txt", "r") do |f|
f.each_line do |line|
# set the page_url to the current line
page = Nokogiri::HTML(open(line))
links = page.css("a")
puts links[0]["href"]
end
end