I am scraping a small number of sites with the Ruby Anemone gem:
Anemone.crawl("http://www.somesite.com") do |anemone|
  anemone.on_every_page do |page|
    ...
  end
end
Some of the sites require 'www' to be present in the URL, while others require that it be omitted. How can I configure the crawler, or write my code, so that it knows which form of the URL to use for each site?
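To make the question concrete, here is a minimal sketch of one possible approach: build both forms of the URL and start the crawl from whichever one passes a check. The helpers `www_variants` and `pick_start_url` are hypothetical names introduced for illustration and are not part of Anemone. What counts as a "working" URL is left to the caller's block, since that depends on how each site responds.

```ruby
require 'uri'

# Hypothetical helper (not part of Anemone): returns the URL as given,
# followed by the same URL with the "www." prefix added or removed.
def www_variants(url)
  uri = URI.parse(url)
  alt = uri.dup
  alt.host = if uri.host.start_with?('www.')
               uri.host.sub(/\Awww\./, '')
             else
               "www.#{uri.host}"
             end
  [uri.to_s, alt.to_s]
end

# Hypothetical helper: returns the first variant for which the block
# returns a truthy value, or nil if neither variant passes.
def pick_start_url(url)
  www_variants(url).find { |candidate| yield candidate }
end

# Possible usage (assumes network access; treats any 2xx response as "working"):
#
#   require 'net/http'
#   start = pick_start_url("http://www.somesite.com") do |candidate|
#     begin
#       Net::HTTP.get_response(URI(candidate)).is_a?(Net::HTTPSuccess)
#     rescue StandardError
#       false
#     end
#   end
#   Anemone.crawl(start) { |anemone| ... } if start
```

The check runs once per site before the crawl starts, so the crawl block itself stays unchanged. A site that answers the wrong form with a redirect rather than an error would need a different check, for example inspecting the `Location` header.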