I am trying to scrape data from a few sites. After a while, the web crawler starts logging Twisted's twisted.internet.error.ConnectionLost error, followed by user timeouts. I do not understand how Twisted works internally. Because of these errors, requests keep being retried and the crawler runs for ages. I don't know what is making it so slow; my internet connection is fine. What could be causing this?
Below is the error:
2014-02-04 14:22:20+0530 [bb] DEBUG: Retrying <GET http://www.bloomberg.com/news/2014-02-02/romanians-reject-euro-loans-after-hungary-disaster-mortgages.html> (failed 1 times): [<twisted.python.failure.Failure <class 'twisted.internet.error.ConnectionLost'>>]
2014-02-04 14:22:20+0530 [bb] INFO: Crawled 20 pages (at 7 pages/min), scraped 0 items (at 0 items/min)
2014-02-04 14:22:57+0530 [bb] DEBUG: Retrying <GET http://www.bloomberg.com/news/2014-02-03/u-s-said-to-probe-banks-over-sovereign-wealth-fund-deals.html> (failed 1 times): User timeout caused connection failure: Getting http://www.bloomberg.com/news/2014-02-03/u-s-said-to-probe-banks-over-sovereign-wealth-fund-deals.html took longer than 180 seconds..
2014-02-04 14:22:57+0530 [bb] DEBUG: Retrying <GET http://search1.bloomberg.com/search/?content_type=all&page=1&q=ROYAL%20BANK%20OF%20CANADA> (failed 1 times): User timeout caused connection failure: Getting http://search1.bloomberg.com/search/?content_type=all&page=1&q=ROYAL%20BANK%20OF%20CANADA took longer than 180 seconds..
2014-02-04 14:22:57+0530 [bb] DEBUG: Retrying <GET http://www.bloomberg.com/news/2014-02-03/canada-consumer-sentiment-dips-to-8-month-low-on-currency.html> (failed 1 times): User timeout caused connection failure.
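In case it helps narrow things down: the log format is Scrapy's, and the 180 seconds in the timeout messages matches Scrapy's default DOWNLOAD_TIMEOUT. As far as I can tell, the settings below are the ones that govern the timeout, retry, and request-rate behaviour seen above. This is only a sketch of what could be tuned in settings.py; every value is an illustrative assumption, not my current configuration, and I have not confirmed that any of them fixes the problem.

```python
# settings.py (sketch): illustrative values only, not a confirmed fix.

# Seconds the downloader waits for a response before timing out.
# Scrapy's default is 180, which matches "took longer than 180 seconds".
DOWNLOAD_TIMEOUT = 30

# Retry middleware: failed requests (ConnectionLost, timeouts, ...) are
# rescheduled up to RETRY_TIMES additional times.
RETRY_ENABLED = True
RETRY_TIMES = 2

# Slow the crawl down in case the site is dropping or stalling
# connections because requests arrive too quickly (an assumption).
DOWNLOAD_DELAY = 1.0
CONCURRENT_REQUESTS_PER_DOMAIN = 2

# Let Scrapy adapt the delay to the observed response latency.
AUTOTHROTTLE_ENABLED = True
```

With a shorter timeout, a stalled request would at least fail and be retried sooner instead of blocking a download slot for three minutes.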
Thanks for the help