I want to create a multi-threaded web crawler, but after doing some research I've discovered that the mechanize gem
is not thread-safe. So my question is: is it possible to write a multi-threaded crawler that scrapes multiple search engines at the same time? Example:
require 'nokogiri'
require 'rest-client'

def site(url)
  Nokogiri::HTML(RestClient.get(url))
end

def parse(url, tag, i)
  parsing = site(url)
  parsing.css(tag)[i].to_s
end
t1 = Thread.new do
  agent = Mechanize.new
  # do some searching and start the search
  parse('http://google.com', 'html', 0)
end

t2 = Thread.new do
  agent = Mechanize.new
  # same thing, running in tandem
  parse('http://duckduckgo.com', 'html', 0)
end

[t1, t2].each(&:join) # wait for both crawls to finish before exiting