
I want my running spider to close immediately, without processing any scheduled requests. I've tried the following approaches, to no avail:

  • Raising CloseSpider in callback functions
  • Calling spider.crawler.engine.close_spider(spider, 'reason') in a downloader middleware

The script is automated: it runs several spiders in a loop. I want the running spider to close instantly when it meets a certain condition, and the program to continue with the rest of the spiders in the loop.

Is there a way to drop the requests from the scheduler queue?

I have included a snippet where I'm trying to terminate the spider:


class TooManyRequestsMiddleware:
    def process_response(self, request, response, spider):
        if response.status == 429:
            spider.crawler.engine.close_spider(
                spider,
                f"Too many requests! Response status code: {response.status}"
            )
        elif 'change_spotted' in spider.kwargs:
            print("Attempting to close down the spider")
            spider.crawler.engine.close_spider(spider, "Spider is terminated!")
        return response
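
One point worth noting: `close_spider` only asks the engine to wind down, so requests already handed to the scheduler can still flow through before shutdown completes. A common workaround is to set a flag when the stop condition fires and then refuse every later request from `process_request` by raising `IgnoreRequest`. The sketch below models that pattern with a stand-in exception class so it runs without Scrapy installed; in a real project `IgnoreRequest` comes from `scrapy.exceptions`, the flag name `shutting_down` is my own choice, and the middleware must be registered under `DOWNLOADER_MIDDLEWARES` in the project settings.

```python
class IgnoreRequest(Exception):
    """Stand-in for scrapy.exceptions.IgnoreRequest, so this sketch
    runs standalone. In a real project, import it from Scrapy instead."""


class DropAfterShutdownMiddleware:
    def process_request(self, request, spider):
        # Once shutdown has been requested, drop every remaining request
        # before it reaches the downloader.
        if getattr(spider, "shutting_down", False):
            raise IgnoreRequest("spider is shutting down")
        return None  # let the request proceed normally

    def process_response(self, request, response, spider):
        if response.status == 429:
            # Flag name is an assumption; any spider attribute works.
            spider.shutting_down = True
            # In a real middleware you would also call:
            # spider.crawler.engine.close_spider(spider, "Too many requests!")
        return response
```

With this in place, the first 429 response flips the flag, and every request the scheduler subsequently dispatches is discarded by `process_request` instead of being downloaded, so the spider drains and closes much faster.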

naru22to
    Can you share this part of the code? – SuperUser Aug 11 '21 at 10:11
  • @SuperUser shared my downloader middleware snippet – naru22to Aug 11 '21 at 12:02
  • CloseSpider should work so I tried it with your code and it was ok, did you remember to add the middleware to your settings? – SuperUser Aug 11 '21 at 12:26
  • Yeah, it works, though it takes time to close the spider; in the meantime a few requests from the scheduler will still go through the pipeline. I want the spider to close immediately when it meets the given condition. Thanks! – naru22to Aug 11 '21 at 12:43
  • @naru22to This issue can be solved by use of [`os._exit(0)`](https://stackoverflow.com/a/55877309/10884791) – Georgiy Aug 20 '21 at 16:50
  • Does this answer your question? [Unable to make my script stop when some urls are scraped](https://stackoverflow.com/questions/55792062/unable-to-make-my-script-stop-when-some-urls-are-scraped) – Georgiy Aug 20 '21 at 16:52

0 Answers