twisted.internet.error.ConnectionLost while scraping with the Scrapy framework

Question

I am trying to scrape data from several sites, but after a while the crawler starts reporting twisted.internet.error.ConnectionLost errors. I do not understand how Twisted works. Because of these errors the crawlers also keep running for a very long time, and I do not know what is making them so slow. Please suggest possible causes. My internet connection is fine.

Below is the error:

2014-02-04 14:22:20+0530 [bb] DEBUG: Retrying <GET http://www.bloomberg.com/news/2014-02-02/romanians-reject-euro-loans-after-hungary-disaster-mortgages.html> (failed 1 times): [<twisted.python.failure.Failure <class 'twisted.internet.error.ConnectionLost'>>]
2014-02-04 14:22:20+0530 [bb] INFO: Crawled 20 pages (at 7 pages/min), scraped 0 items (at 0 items/min)
2014-02-04 14:22:57+0530 [bb] DEBUG: Retrying <GET http://www.bloomberg.com/news/2014-02-03/u-s-said-to-probe-banks-over-sovereign-wealth-fund-deals.html> (failed 1 times): User timeout caused connection failure: Getting http://www.bloomberg.com/news/2014-02-03/u-s-said-to-probe-banks-over-sovereign-wealth-fund-deals.html took longer than 180 seconds..
2014-02-04 14:22:57+0530 [bb] DEBUG: Retrying <GET http://search1.bloomberg.com/search/?content_type=all&page=1&q=ROYAL%20BANK%20OF%20CANADA> (failed 1 times): User timeout caused connection failure: Getting http://search1.bloomberg.com/search/?content_type=all&page=1&q=ROYAL%20BANK%20OF%20CANADA took longer than 180 seconds..
2014-02-04 14:22:57+0530 [bb] DEBUG: Retrying <GET http://www.bloomberg.com/news/2014-02-03/canada-consumer-sentiment-dips-to-8-month-low-on-currency.html> (failed 1 times): User timeout caused connection failure.

Thanks for the help

You can try urllib2 inside a parse callback instead, [hope this helps](http://stackoverflow.com/a/24195788/2297751) — Jon, Jun 13 '14 at 00:06
Did you ever find an answer to this? I'm having the same problem, with just a few sites. — LandonC, May 05 '15 at 19:23
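
For anyone hitting the same log output: the "took longer than 180 seconds" messages match Scrapy's default DOWNLOAD_TIMEOUT of 180 seconds, so each stalled request can block a download slot for three minutes before it is retried. That would explain why the crawl drags on. A minimal settings sketch that might help is below. The setting names are real Scrapy settings, but the values are illustrative assumptions, not tuned for any particular site, and they will not fix a server that is deliberately dropping connections.

```python
# settings.py (sketch): all values below are assumptions chosen for illustration.

# Fail stalled requests sooner than the 180-second default seen in the log.
DOWNLOAD_TIMEOUT = 30

# Keep retries bounded so repeatedly failing URLs do not stretch the crawl.
RETRY_ENABLED = True
RETRY_TIMES = 2  # retries in addition to the first attempt

# Crawl more gently; aggressive crawling is one possible cause of dropped
# connections, though the log alone does not confirm it.
DOWNLOAD_DELAY = 1.0
CONCURRENT_REQUESTS_PER_DOMAIN = 2

# Let Scrapy adjust the delay based on observed response latency.
AUTOTHROTTLE_ENABLED = True
```

If the errors persist with a short timeout and a slow crawl rate, the cause is more likely on the server side than in Twisted itself.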

