403 => Net::HTTPForbidden for https://www.state.gov/countries-areas-archive/tunisia/page/2/ -- unhandled response (Mechanize::ResponseCodeError)
This is what I see in the console. I am trying to scrape nine pages of U.S. State Department statements about Tunisia. What is wrong?
The code looks correct to me. It uses Ruby with Mechanize:
require 'mechanize'

agent = Mechanize.new

# Walk the nine archive pages, numbered 1 through 9.
9.times do |i|
  page = agent.get("https://www.state.gov/countries-areas-archive/tunisia/page/#{i + 1}/")

  # Follow each statement link found on the listing page.
  page.search('a.collection-result_link').each do |link|
    statement = agent.click(link)

    url = statement.at('link[rel="canonical"]')['href']
    # 'الرابط' is Arabic for "the link".
    wrapped_url = "<a href='#{url}'>الرابط</a>"
    # Classes on the same element are chained with dots, not separated by spaces.
    title = statement.search('h1.featured-content__headline.report-header__headline.stars-above').text

    puts [wrapped_url, title]
  end
end
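
One direction worth trying, sketched under assumptions rather than verified against state.gov: a 403 on a page that opens fine in a browser often means the server is refusing the client's default User-Agent, and since the error shows up on page 2 rather than page 1, request throttling may also play a part. The sketch below sets one of Mechanize's built-in browser aliases, pauses between requests, and rescues `Mechanize::ResponseCodeError` so that one refused page does not abort the whole run. The helper names `page_url`, `build_agent`, and `fetch_page`, and the 2-second pause, are hypothetical choices made for illustration.

```ruby
BASE_URL = 'https://www.state.gov/countries-areas-archive/tunisia/page/'

# Hypothetical helper: URL of the n-th archive page (1-based).
def page_url(page_number)
  "#{BASE_URL}#{page_number}/"
end

# Hypothetical helper: a Mechanize agent that identifies itself as a browser.
def build_agent
  require 'mechanize' # loaded here so page_url works even without the gem
  agent = Mechanize.new
  # Assumption: the default Mechanize User-Agent is what gets refused.
  agent.user_agent_alias = 'Mac Safari' # one of Mechanize's built-in aliases
  agent
end

# Hypothetical helper: fetch a URL and return the page, or nil when the
# server answers with an error status such as 403.
def fetch_page(agent, url, pause: 2)
  sleep(pause) # assumption: spacing requests out may avoid throttling
  agent.get(url)
rescue Mechanize::ResponseCodeError => e
  warn "Skipping #{url}: HTTP #{e.response_code}"
  nil
end
```

To use it, create the agent once with `agent = build_agent`, then call `page = fetch_page(agent, page_url(n))` for each `n` from 1 to 9 and skip the iteration when `page` is `nil`. If the 403 persists with a browser-like User-Agent, the site may be applying bot protection that these steps cannot get around.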