Certain websites require a particular IP address before they display certain information, e.g. ads for country X. I would like to know whether it is possible to use a proxy (preferably a Ruby one) with my Ruby script on ScraperWiki to get the results as if I were in country X. Right now the script gets the results for the UK. If I use an HTTP proxy, I can see the website that I want to retrieve the data from correctly. The problem is that ScraperWiki does not return the webpage as if it were in country X.

Has QUIT--Anony-Mousse
Pedro Pereira
  • I would like an alternative to using a web-based proxy because these are too slow. With a web-based proxy, instead of doing `doc = Nokogiri::HTML(open(queryurl))` I would do `doc = Nokogiri::HTML(open("http://webproxycountryX.xx?website=#{queryurl}"))` – Pedro Pereira Feb 16 '13 at 14:55
  • Note that [tag:web-scraping] is usually not considered to be data mining. The term data mining is (properly) used for advanced statistical data analysis, not the collection of data. Please use more appropriate tags; this will get you better answers. – Has QUIT--Anony-Mousse Feb 16 '13 at 15:01

1 Answer

Yes. You should be using Mechanize:

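require 'mechanize'

agent = Mechanize.new
# host and port identify an HTTP proxy located in country X
agent.set_proxy host, port
# url is the page to scrape; the request is routed through the proxy
page = agent.get url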

Now call page#search or page#at just like you would with your Nokogiri document.
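
If installing Mechanize is not an option, the same idea can be sketched with only the standard library. This is a different technique from the one above: it uses Net::HTTP's built-in proxy arguments rather than Mechanize. The helper name `proxied_client`, the host `proxy.example`, and the port `8080` are hypothetical placeholders, not values from the question.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds a Net::HTTP client that sends requests for
# target_url through the HTTP proxy at proxy_host:proxy_port.
# Nothing is sent over the network until a request is actually made.
def proxied_client(target_url, proxy_host, proxy_port)
  uri = URI.parse(target_url)
  # The third and fourth arguments to Net::HTTP.new are the proxy address and port.
  http = Net::HTTP.new(uri.host, uri.port, proxy_host, proxy_port)
  http.use_ssl = (uri.scheme == 'https')
  http
end

# Placeholder values (assumptions): substitute a real proxy located in country X.
client = proxied_client('http://example.com/', 'proxy.example', 8080)
puts client.proxy?         # whether a proxy is configured
puts client.proxy_address  # the proxy host the client will use
```

With a working proxy, the body returned by something like `client.get('/').body` could be passed to `Nokogiri::HTML` as in the original script. open-uri also accepts a `:proxy` option, which may be the smallest change to the existing `open(queryurl)` call.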

the Tin Man
pguardiario