I am using Tweepy and Python to gather tweets that match certain keywords, and then writing those status updates (tweets) to a CSV file. I do not consider myself a programmer, and I am really lost on this.

Here is the error:

Traceback (most recent call last):
  File "./combined-tweepy.py", line 58, in <module>
    sapi.filter(track=[topics])
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 286, in filter
    encoded_track = [s.encode(encoding) for s in track]
AttributeError: 'tuple' object has no attribute 'encode'

Here is the script:

#!/usr/bin/python
import sys
import re
import tweepy
import codecs
import datetime

consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

# Create a list of topics
with open('termList.txt', 'r') as f:
  topics = [line.strip() for line in f]

stamp = datetime.datetime.now().strftime('%Y-%m-%d-%H%M%S')
topicFile = open(stamp + '.csv', 'w+')
sapi = tweepy.streaming.Stream(auth, CustomStreamListener(topicFile))
sapi.filter(track=[topics])

class CustomStreamListener(tweepy.StreamListener):
    def __init__(self, output_file, api=None):
        super(CustomStreamListener, self).__init__()
        self.num_tweets = 0
        self.output_file = output_file

    def on_status(self, status):
        ### Writes one tweet per line in the CSV file
        cleaned = status.text.replace('\'','').replace('&amp;','').replace('&gt;','').replace(',','').replace("\n",'')
        self.num_tweets = self.num_tweets + 1
        if self.num_tweets < 500:
            self.output_file.write(status.user.location.encode("UTF-8") + ',' + cleaned.encode("UTF-8") + "\n")
            print ("capturing tweet from list")
            # print status.user.location
            return True
        else:
            return False
            sys.exit("terminating")

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True #Don't kill the stream

f.close()
Jan Vlcinsky
RoninUTA
  • Does 'termList.txt' have something called encode in it? – Srivatsan May 25 '14 at 20:08
  • I am not sure how to put this in a list format in a comment: BlackStone ViceLords Piru Crips Barrio Azteca FBD 624 BDS MLD Nortenos Tangos Vallucos Orejas Foritos Houstone Surenos Trinitarios Armanian Assyrian Nuestra Syndicate Hammerskins Lowriders Volksfront Capirucha Corpitos Tangos Mandingo Pocos Tongs Salvatrucha MS-13 Sureno 915. Two of the topics, 915 and 624, are area codes representing a gang. – RoninUTA May 26 '14 at 22:42

1 Answer

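A tuple is Python's immutable sequence type (see the definition in Python's documentation). The traceback shows tweepy calling s.encode(encoding) on every element of track, so the error suggests that one of the elements it received is a tuple rather than a string.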
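One way to check this is to scan the list for entries that lack an encode method before handing it to tweepy. The following is only a minimal sketch: find_unencodable is a hypothetical helper name, and the sample list (including the tuple) is made up for illustration, not taken from your file.

```python
def find_unencodable(track):
    """Return (index, type name) for every entry that has no .encode method.

    Hypothetical helper: it mirrors the requirement visible in the traceback,
    where tweepy calls s.encode(encoding) on each element of track.
    """
    return [
        (index, type(item).__name__)
        for index, item in enumerate(track)
        if not hasattr(item, "encode")
    ]


# Made-up sample: the second entry is a tuple instead of a string.
sample_track = ["Crips", ("915", "624"), "Piru"]

problems = find_unencodable(sample_track)
print(problems)
```

If the result is empty for your real list, every entry can be encoded and the problem is more likely in how the list is passed to filter.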

I also see a few smaller errors. First, Python runs a script from top to bottom, so you must define your classes and functions before the lines that use them. For example, these two lines

sapi = tweepy.streaming.Stream(auth, CustomStreamListener(topicFile))
sapi.filter(track=[topics])

should come after you have defined the class and all of its methods in

class CustomStreamListener(tweepy.StreamListener):
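The ordering problem can be reproduced without tweepy. In this small sketch, DemoListener is a hypothetical stand-in for CustomStreamListener; the first attempt to use it fails because the class statement has not run yet.

```python
# Using the class before its definition has been executed raises NameError.
try:
    listener = DemoListener()
    early_error = None
except NameError as exc:
    early_error = str(exc)


class DemoListener(object):
    """Hypothetical stand-in for CustomStreamListener."""

    def __init__(self):
        self.num_tweets = 0


# After the class statement has run, the same call succeeds.
listener = DemoListener()

print(early_error)
print(listener.num_tweets)
```

In your script, this means moving the class definition above the lines that create the Stream and call filter.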

Also, there's no need to wrap topics in square brackets (doing so passes a list that contains one list, rather than a list of strings)

sapi.filter(track=[topics])

since it's already a list according to this line

topics = [line.strip() for line in f]
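To see why the extra brackets matter, here is a rough sketch that imitates the line from the traceback. encode_track is a hypothetical stand-in for that one line, not tweepy's actual function, and the three sample terms are taken from the list in the comments.

```python
def encode_track(track, encoding="utf-8"):
    # Imitates the line shown in the traceback:
    #     encoded_track = [s.encode(encoding) for s in track]
    return [s.encode(encoding) for s in track]


topics = ["Piru", "Crips", "915"]  # a few sample terms

# Wrapped in extra brackets: the only element is a list, which has no .encode.
try:
    encode_track([topics])
    nested_error = None
except AttributeError as exc:
    nested_error = str(exc)

# Passed directly: every element is a string, so encoding works.
encoded = encode_track(topics)

print(nested_error)
print(len(encoded))
```

Note that this sketch fails on a list, while your traceback mentions a tuple, so the script you ran may differ slightly from the one posted. Either way, passing track=topics gives tweepy one string per element.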

Can you show us the content of termList.txt?

blue_chip
  • I put this in the comment above, one entry per line in the txt file. – RoninUTA May 26 '14 at 23:01
  • I don't see a problem with the way your list of topics is constructed. Try correcting the small errors that I pointed out in your code first; maybe that will help. For the moment, I really don't know what else to suggest. – blue_chip May 27 '14 at 03:48
  • Thanks, blue_chip. I will see where this leads me. – RoninUTA May 27 '14 at 14:27