I have a Scrapy spider and Pipeline setup.
My Spider extracts data from a website and my Pipeline's process_item()
method inserts the extracted data into a temporary database table.
At the end, in the Pipeline's close_spider()
method, I run some error checks on the temporary table and, if everything looks okay, make the temporary table permanent.
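For context, here is a minimal sketch of the setup described above (the table names, item fields, and the use of SQLite are just placeholders; my real schema and checks are more involved):

```python
import sqlite3


class TempTablePipeline:
    """Sketch: insert items into a temporary table, promote it on close."""

    def __init__(self, db_path="items.db"):  # hypothetical database path
        self.db_path = db_path

    def open_spider(self, spider):
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute("DROP TABLE IF EXISTS items_tmp")
        self.conn.execute("CREATE TABLE items_tmp (name TEXT, price REAL)")

    def process_item(self, item, spider):
        # Each extracted item goes into the temporary table.
        self.conn.execute(
            "INSERT INTO items_tmp (name, price) VALUES (?, ?)",
            (item["name"], item["price"]),
        )
        return item

    def close_spider(self, spider):
        # Placeholder error check: require at least one row before promoting.
        (count,) = self.conn.execute(
            "SELECT COUNT(*) FROM items_tmp"
        ).fetchone()
        if count > 0:
            # Make the temporary table permanent by renaming it.
            self.conn.execute("DROP TABLE IF EXISTS items")
            self.conn.execute("ALTER TABLE items_tmp RENAME TO items")
        self.conn.commit()
        self.conn.close()
```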
However, if Scrapy encounters exceptions before the Pipeline's close_spider()
method is called, it's possible that the extracted data is incomplete.
Is there a way to check, in the Pipeline's close_spider()
method, whether Scrapy has encountered exceptions? If there are errors (indicating that the extracted data may be incomplete), I do not want to make the temporary table permanent.
I am using the CloseSpider
extension with CLOSESPIDER_ERRORCOUNT
set to 1 to close the Spider on the first error. However, I haven't figured out how to distinguish between a normal close and an error close in the Pipeline's close_spider()
method.
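For reference, the CloseSpider configuration I'm using is just this (in settings.py):

```python
# settings.py
# Close the spider as soon as the first error is logged.
CLOSESPIDER_ERRORCOUNT = 1
```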