
I am using the Scrapy framework to crawl some websites. I want to stop crawling immediately once a flag I set is raised. In my pipeline I stop the crawler like this:

spider.crawler.engine.close_spider(spider, reason='My reason')

It stops when I want, but it keeps executing and still sends requests for the URLs remaining in the connection pool, which I don't want. How can I stop it immediately? Is there a way to clear the URLs from the connection pool?
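For context, a minimal sketch of a pipeline doing this kind of shutdown (the class name and the `stop_flag` item field are hypothetical, not from the original post):

```python
class StopFlagPipeline:
    """Hypothetical pipeline that asks the engine to close the spider
    when an item carries a stop flag."""

    def process_item(self, item, spider):
        if item.get("stop_flag"):  # hypothetical flag field
            # Ask Scrapy's engine to close this spider; note this only
            # stops *scheduling* new requests, it is not immediate.
            spider.crawler.engine.close_spider(spider, reason="My reason")
        return item
```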

Thank you in advance.

memeister

1 Answer

  1. According to the Scrapy docs, close_spider stops scheduling new requests; it does not stop the crawling process immediately. In your case close_spider worked exactly as documented.

  2. The only way I know to stop crawling immediately is to call os._exit, as in this answer.
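A minimal sketch of that approach (again, the class name and `stop_flag` field are hypothetical). Note that os._exit terminates the whole process at once, skipping Twisted's shutdown, pending-request draining, and any cleanup such as finally blocks or FEED exports:

```python
import os


class HardStopPipeline:
    """Hypothetical pipeline that kills the entire process on a flag."""

    def process_item(self, item, spider):
        if item.get("stop_flag"):  # hypothetical flag field
            # os._exit exits immediately without running cleanup handlers,
            # so no remaining queued requests are ever sent.
            os._exit(0)
        return item
```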

Georgiy