`Crawler` (`scrapy.crawler`) is the main entry point to the Scrapy API. It provides access to all Scrapy core components, and it's how extensions hook their functionality into Scrapy.
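
For example, an extension typically receives the running `Crawler` through the `from_crawler` class method. A minimal sketch (the extension class name is hypothetical; `from_crawler` and the signals API are standard Scrapy):

```python
from scrapy import signals


class SpiderOpenCloseLogging:
    """Hypothetical extension that logs spider open/close events."""

    def __init__(self, crawler):
        # The Crawler gives access to core components, e.g. the
        # settings object and the signal manager.
        crawler.signals.connect(self.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(self.spider_closed, signal=signals.spider_closed)

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this class method with the running Crawler;
        # this is how extensions get hooked into the framework.
        return cls(crawler)

    def spider_opened(self, spider):
        spider.logger.info("Spider opened: %s", spider.name)

    def spider_closed(self, spider):
        spider.logger.info("Spider closed: %s", spider.name)
```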
The `Scraper` (`scrapy.core.scraper`) component is responsible for parsing responses and extracting information from them. It is driven by the Engine and is used to run your spiders.
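
You never instantiate the Scraper yourself; it is the internal component that invokes your spider callbacks and processes whatever they yield (items or further requests). A minimal spider whose `parse` callback the Scraper would drive (the site is the Scrapy tutorial sandbox; selectors are illustrative):

```python
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com/"]

    def parse(self, response):
        # The Scraper calls this callback with each downloaded response
        # and collects the items it yields.
        for quote in response.css("div.quote"):
            yield {"text": quote.css("span.text::text").get()}
```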
`scrapy.spiders` is a module containing the base `Spider` implementation (which you extend to write your own spiders), together with some common spiders available out of the box (like `CrawlSpider` for ruleset-based crawling, `SitemapSpider` for sitemap-based crawling, or `XMLFeedSpider` for crawling XML feeds).
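
As an illustration, a ruleset-based `CrawlSpider` might look like this sketch (the domain and URL patterns are placeholders; `CrawlSpider`, `Rule`, and `LinkExtractor` are standard Scrapy):

```python
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class MySpider(CrawlSpider):
    name = "example"
    allowed_domains = ["example.com"]        # placeholder domain
    start_urls = ["https://example.com/"]

    rules = (
        # Follow category pages without parsing them.
        Rule(LinkExtractor(allow=r"/category/"), follow=True),
        # Send matching item pages to the parse_item callback.
        Rule(LinkExtractor(allow=r"/item/"), callback="parse_item"),
    )

    def parse_item(self, response):
        yield {"title": response.css("h1::text").get()}
```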
More information is available on the official documentation pages:
http://doc.scrapy.org/en/latest/topics/spiders.html?highlight=crawlspider#spiders
http://doc.scrapy.org/en/latest/topics/api.html?highlight=scrapy.crawler#module-scrapy.crawler