I am trying to catch the exception raised when the connection is reset by the peer during real-time streaming of tweets, but the try/except block does not seem to catch the error; it passes straight through. Please advise whether the block is misplaced in the code or whether something else is wrong with the code.
I have created a script that streams tweets in real time to a spreadsheet (CSV) file. The stream has often been disconnected by an ECONNRESET error, which means the connection was reset by the peer:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 297, in _run
    six.reraise(*exc_info)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 266, in _run
    self._read_loop(resp)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 316, in _read_loop
    line = buf.read_line().strip()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 181, in read_line
    self._buffer += self._stream.read(self._chunk_size)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/response.py", line 430, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/response.py", line 349, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
ProtocolError: ('Connection broken: error("(104, 'ECONNRESET')",)', error("(104, 'ECONNRESET')",))
It's a protocol error, and I tried to catch it by importing the exception classes from the urllib3 library, but the try/except block is not able to suppress it and continue with the streaming.
import pandas as pd
import csv
from bs4 import BeautifulSoup
import re
import tweepy
import ast
from datetime import datetime
import time
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json
from unidecode import unidecode
from urllib3.exceptions import ProtocolError
from urllib3.exceptions import IncompleteRead
import requests
consumer_key= 'xxxxxxxxx'
consumer_secret= 'xxxxxxxxx'
access_token= 'xxxxxxxxx'
access_token_secret= 'xxxxxxxxx'
# Start with an empty output file; opening in 'w' mode already truncates it,
# and the with-block closes it.
with open('TEST_FEB.csv', 'w') as f:
    pass
class listener(StreamListener):
    def on_data(self, data):
        data1 = json.loads(data)
        time = data1["created_at"]
        # data1 is a dict, so test for the key with `in` rather than hasattr().
        if "retweeted_status" in data1:
            tweet = unidecode(data1["retweeted_status"]["text"])
        # "truncated" is a JSON boolean, not the string "true".
        elif data1["truncated"]:
            tweet = unidecode(data1["extended_tweet"]["full_text"])
        else:
            tweet = unidecode(data1["text"])
        tweet1 = BeautifulSoup(tweet, "lxml").get_text()
        url = "https://twitter.com/{}/status/{}".format(
            data1["user"]["screen_name"], data1["id_str"])
        # Append one row per tweet; the with-block closes the file.
        with open('TEST_FEB.csv', 'a') as f:
            csv.writer(f).writerow([time, tweet1, url])

    def on_limit(self, track):
        return True
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

while True:
    try:
        twitterStream = Stream(auth, listener(),
                               wait_on_rate_limit=True, retry_count=10,
                               stall_warnings=True)
        twitterStream.filter(track=["abcd"], async=True)
    except ProtocolError as error:
        print(str(error))
        continue
    except IncompleteRead as IR:
        print(str(IR))
        continue
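The traceback header says "Exception in thread Thread-1", which suggests the error is raised in a background thread rather than in the thread that runs the try/except block. The following is a minimal Python 3 sketch of that general behaviour and does not use tweepy at all: `worker` is a hypothetical stand-in for the streaming read loop, and `ConnectionResetError` stands in for the urllib3 `ProtocolError`.

```python
import threading

caught_in_main = []   # exceptions seen by the try/except below
seen_by_hook = []     # exception types reported for the worker thread


def worker():
    # Hypothetical stand-in for the streaming read loop.
    # The error is raised inside this thread, not in the caller.
    raise ConnectionResetError(104, "ECONNRESET")


# Record uncaught thread exceptions instead of printing
# "Exception in thread ..." to stderr (Python 3.8+).
threading.excepthook = lambda args: seen_by_hook.append(args.exc_type)

try:
    t = threading.Thread(target=worker)
    t.start()   # returns immediately; the worker runs on its own
    t.join()    # waits for the worker, but does not re-raise its error
except ConnectionResetError as exc:
    caught_in_main.append(exc)

print(caught_in_main)
print(seen_by_hook)
```

In this sketch the `except` clause never runs, because the exception never travels back to the thread that started the worker.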
The expected result is that whenever the connection is reset by the peer and this error is raised, the code should suppress it and continue streaming. In its current form, the code does not behave that way.
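The behaviour I am after can be sketched as a retry loop around a blocking call, so that the error surfaces in the same thread as the try/except. This is only a sketch under assumptions: `FlakyStream` and `stream_with_retry` are hypothetical names, `FlakyStream.run` stands in for a streaming call that blocks instead of starting a background thread, and `ConnectionResetError` stands in for `ProtocolError`. Whether calling `filter` without `async=True` blocks in this way in my tweepy version is an assumption I have not confirmed.

```python
import time


class FlakyStream:
    """Hypothetical stand-in for a blocking stream: fails twice, then finishes."""

    def __init__(self):
        self.attempts = 0

    def run(self):
        self.attempts += 1
        if self.attempts <= 2:
            raise ConnectionResetError(104, "ECONNRESET")
        return "done"


def stream_with_retry(stream, max_retries=5, delay=0.01):
    """Call stream.run() again after a connection reset, up to max_retries times."""
    for attempt in range(1, max_retries + 1):
        try:
            # Blocking call: any error is raised here, in this thread.
            return stream.run()
        except ConnectionResetError as error:
            print("attempt %d failed: %s" % (attempt, error))
            time.sleep(delay)  # short pause before reconnecting
    raise RuntimeError("gave up after %d attempts" % max_retries)


stream = FlakyStream()
result = stream_with_retry(stream)
print(result, stream.attempts)
```

The retry limit and the pause are illustrative values only; a real reconnect loop would probably want a longer, growing delay.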