Is it possible to scrape the products from an e-commerce site using the Anemone and Nokogiri libraries in Ruby?

I understand how to pull the data I need from each product page using Nokogiri, but I can't figure out how to make Anemone/Nokogiri crawl the site and grab all the product pages.

A push in the right direction would be much appreciated.

Dan
  • I've never had luck getting anemone to work right. I've tried it a few times but gave up and used mechanize each time instead. – pguardiario May 20 '12 at 08:32
  • [What have you tried?](http://mattgemmell.com/2008/12/08/what-have-you-tried/) What is your code? What is your question? – Phrogz May 21 '12 at 04:39
  • http://stackoverflow.com/questions/10679058/ruby-scraper-how-to-export-to-csv – Dan May 21 '12 at 05:05

1 Answer

I figured out my issues. The first was that Anemone didn't seem to be crawling all the pages. This was because the pages I wanted were under a subdomain, which I had to tell Anemone to crawl separately from the main domain. The second was that I needed a way to determine which pages were actually product pages (and thus needed to be parsed). I did this by pulling one of the fields I wanted (the SKU number) from each page and then testing with a regex whether it really was a SKU.
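For anyone landing here later, the approach looks roughly like the sketch below. It is not my exact code: the domain names, the `.sku` CSS selector, the SKU format, and the helper names `sku?` and `each_product_page` are all made up for illustration, so adjust them to your site. The main point is that both the main domain and the subdomain are passed to `Anemone.crawl` as starting URLs, and a page only counts as a product page if its SKU field matches the regex.

```ruby
# Hypothetical SKU format: 2-4 capital letters, a dash, then 3-6 digits
# (e.g. "AB-12345"). Replace this with whatever your site actually uses.
SKU_PATTERN = /\A[A-Z]{2,4}-\d{3,6}\z/

# True when the text, with surrounding whitespace trimmed, looks like a SKU.
def sku?(text)
  !text.nil? && SKU_PATTERN.match?(text.strip)
end

# Crawls from every seed URL and yields [url, sku] for each page whose
# SKU field passes the regex test. The default selector is an assumption.
def each_product_page(seed_urls, sku_selector: '.sku')
  require 'anemone' # loaded here so sku? can be used without the gem

  Anemone.crawl(seed_urls) do |anemone|
    anemone.on_every_page do |page|
      next unless page.doc # page.doc is nil for non-HTML responses

      node = page.doc.at_css(sku_selector)
      next unless node && sku?(node.text)

      yield page.url.to_s, node.text.strip
    end
  end
end

# Example usage (placeholder domains; the subdomain is seeded separately):
#   seeds = ['http://www.example.com/', 'http://shop.example.com/']
#   each_product_page(seeds) { |url, sku| puts "#{sku}\t#{url}" }
```

Inside the block you can run whatever Nokogiri parsing you already have against `page.doc` instead of just yielding the SKU.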

Dan