We can crawl a whole website with Anemone (e.g. https://stackoverflow.com/), but what if I only want to focus on a certain folder (e.g. https://stackoverflow.com/questions)? How can I do this? Maybe with the `focus_crawl` method?
Viewed 550 times
Ghilas BELHADJ
1 Answer
Check the `keep_if` method; it may help:
http://danneu.com/posts/8-scraping-a-blog-with-anemone-ruby-web-crawler-and-mongodb#toc_1
Try passing the pattern you want to crawl.
There is also a gist: https://gist.github.com/1149906
NOTE: I haven't tested it, but you can certainly try.
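A minimal sketch of the idea, not tested against a live crawl: Anemone's `focus_crawl` takes a block that receives each fetched page and returns the links to follow, so filtering `page.links` there should keep the crawler inside one folder. The names `START_URL`, `QUESTIONS_PATH`, `in_section?` and `crawl_section` are made up for illustration, and `select` is used here where the linked post suggests `keep_if`.

```ruby
require 'uri'

# Hypothetical names, chosen for this example only.
START_URL = 'https://stackoverflow.com/questions'
QUESTIONS_PATH = %r{\A/questions(/|\z)}

# True when the link is on the start host and its path is under /questions.
def in_section?(link, host = URI(START_URL).host)
  uri = URI(link.to_s)
  uri.host == host && !!(uri.path =~ QUESTIONS_PATH)
end

# Crawl only the chosen section. Needs the anemone gem, loaded lazily
# so the filter above can be used and tested without it.
def crawl_section(start_url = START_URL)
  require 'anemone'
  Anemone.crawl(start_url) do |anemone|
    # Return only the links that stay inside the section.
    anemone.focus_crawl do |page|
      page.links.select { |link| in_section?(link) }
    end
    anemone.on_every_page { |page| puts page.url }
  end
end
```

Calling `crawl_section` would start the crawl at `START_URL`; the filter itself can be checked on its own, e.g. `in_section?('https://stackoverflow.com/users/1')` is false.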

Pritesh Jain
Thank you PriteshJ, but I finally found the answer. I used the method `on_pages_like` instead of `on_every_page`, with a pattern like this: `on_pages_like(/http:\/\/stackoverflow.com\/questions\/./)`, and it works well. Thank you again. – Ghilas BELHADJ Aug 08 '12 at 18:03
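For reference, a sketch of that approach with the pattern written out in full. As far as I understand, `on_pages_like` only decides which fetched pages the block runs on; the crawler still follows links elsewhere unless they are filtered too (for example with `focus_crawl`). The names `QUESTION_PAGES` and `print_question_pages` are hypothetical, and the pattern is widened to accept both http and https, with the dots escaped so they match literally.

```ruby
# Hypothetical constant: matches question URLs with at least one
# character after "/questions/".
QUESTION_PAGES = %r{\Ahttps?://stackoverflow\.com/questions/.}

# Print the URL of every crawled page that matches the pattern.
# Needs the anemone gem, loaded lazily inside the method.
def print_question_pages(start_url)
  require 'anemone'
  Anemone.crawl(start_url) do |anemone|
    anemone.on_pages_like(QUESTION_PAGES) { |page| puts page.url }
  end
end
```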