
I am trying to catch the exception raised when the connection is reset by the peer during real-time streaming of tweets, but the try/except block does not seem to catch the error and carry on. Please advise whether the block is misplaced or whether something else is wrong with the code.

I have written a script that streams tweets in real time to a CSV file. The stream has often been disconnected by an ECONNRESET error (connection reset by peer):

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 297, in _run
    six.reraise(*exc_info)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 266, in _run
    self._read_loop(resp)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 316, in _read_loop
    line = buf.read_line().strip()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 181, in read_line
    self._buffer += self._stream.read(self._chunk_size)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/response.py", line 430, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/response.py", line 349, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
ProtocolError: ('Connection broken: error("(104, 'ECONNRESET')",)', error("(104, 'ECONNRESET')",))

It is a protocol error, so I tried to catch it by importing the exception classes from urllib3, but the try/except block does not suppress it and the streaming does not continue.

import pandas as pd
import csv
from bs4 import BeautifulSoup
import re
import tweepy
import ast
from datetime import datetime
import time
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json
from unidecode import unidecode
from urllib3.exceptions import ProtocolError
from urllib3.exceptions import IncompleteRead
import requests

consumer_key = 'xxxxxxxxx'
consumer_secret = 'xxxxxxxxx'
access_token = 'xxxxxxxxx'
access_token_secret = 'xxxxxxxxx'

# Opening in 'w' mode already empties the file; the with-block closes it.
with open('TEST_FEB.csv', 'w'):
    pass


class listener(StreamListener):

    def on_data(self, data):
        data1 = json.loads(data)
        created_at = data1["created_at"]
        # json.loads returns a dict, so test for keys with `in`, not hasattr.
        if "retweeted_status" in data1:
            # Retweet: use the text of the original tweet.
            tweet = unidecode(data1["retweeted_status"]["text"])
        elif data1["truncated"]:
            # "truncated" is parsed as a boolean, not the string "true".
            tweet = unidecode(data1["extended_tweet"]["full_text"])
        else:
            tweet = unidecode(data1["text"])
        tweet1 = BeautifulSoup(tweet, "lxml").get_text()
        url = "https://twitter.com/{}/status/{}".format(
            data1["user"]["screen_name"], data1["id_str"])
        # Append one row per tweet; the with-block closes the file.
        with open('TEST_FEB.csv', 'a') as csv_file:
            csv_writer = csv.writer(csv_file)
            csv_writer.writerow([created_at, tweet1, url])

    def on_limit(self, track):
        return True


auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

while True:
    try:
        twitterStream = Stream(auth, listener(),
                               wait_on_rate_limit=True, retry_count=10,
                               stall_warnings=True)
        # Python 2.7 syntax: `async` is a reserved word in Python 3.7+.
        twitterStream.filter(track=["abcd"], async=True)

    except ProtocolError as error:
        print(str(error))
        continue

    except IncompleteRead as IR:
        print(str(IR))
        continue

The expected result is that whenever the connection is reset by the peer and this error is raised, the code suppresses it and continues streaming. In its current form the code does not behave that way.

Nakul Sharma
  • That error traceback doesn't have any references to the posted code. i.e. I expected to see a reference to the `twitterStream = Stream(auth, listener(), ...` line, and there wasn't any. – John Gordon Apr 24 '19 at 17:06
  • Hi @JohnGordon: thanks for your comment. I looked into various ways of handling this error and found this approach of putting the try/except block around the streaming call, so I did the same. Please advise what else I should do to tackle this problem and where I should place this block. I am not doing much real-time processing on the tweets either, so I do not expect slow reading of incoming tweets to create a backlog and disrupt the connection. I look forward to hearing from you. Thanks. – Nakul Sharma Apr 24 '19 at 18:02
  • As far as I can see, your try/except block is correct. I don't understand why the traceback message doesn't include the actual line of code from your module. – John Gordon Apr 24 '19 at 19:16
  • @JohnGordon: thanks for your input. Same here, it seems to be correctly placed but is still not working as expected. – Nakul Sharma Apr 25 '19 at 14:57
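
One possible reading of the traceback: it begins with `Exception in thread Thread-1` and passes through `threading.py`, which suggests the error was raised in a background thread rather than in the main thread where the `try`/`except` sits. In tweepy releases of that era, `filter(..., async=True)` appears to start the stream in a separate thread and return immediately, so no exception would reach the `except` clauses and the `while True` loop would keep creating new streams. A minimal sketch of the alternative, a blocking call inside a retry loop, is shown below. `run_with_retry`, `make_flaky_stream`, and `ConnectionBroken` are hypothetical stand-ins invented for illustration, not tweepy or urllib3 APIs; in the real script the blocking call would presumably be `twitterStream.filter(track=["abcd"])` without `async=True`, and the caught exception would be `ProtocolError`.

```python
import time


class ConnectionBroken(Exception):
    """Hypothetical stand-in for urllib3.exceptions.ProtocolError."""


def run_with_retry(open_stream, max_retries=5, delay_seconds=0.0):
    """Call the blocking open_stream() until it returns or retries run out.

    Returns the number of reconnects that were needed.
    """
    reconnects = 0
    while True:
        try:
            # Blocks in the calling thread, so errors surface right here.
            open_stream()
            return reconnects
        except ConnectionBroken as error:
            if reconnects >= max_retries:
                raise  # give up and let the caller see the last error
            reconnects += 1
            print("reconnecting after: {}".format(error))
            time.sleep(delay_seconds)


def make_flaky_stream(failures):
    """Build a fake stream that raises `failures` times, then returns."""
    state = {"remaining": failures}

    def open_stream():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConnectionBroken("Connection broken: ECONNRESET")

    return open_stream
```

Calling `run_with_retry(make_flaky_stream(3))` simulates three connection resets followed by a clean run. The retry cap and the delay are there so that a persistent outage does not turn into a tight reconnect loop; the values are arbitrary and would need tuning against the service's reconnect guidance.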

0 Answers