I want to store tweets in a CSV file. I used tweepy and managed to store them in a CSV, but it only extracts data for one day. I want to extract and store data for a whole week without having to run the extraction every day.

This is what I have done:

import numpy as np
import pandas as pd

# api is assumed to be an authenticated tweepy.API instance created earlier

def tweets_to_data_frame(public_tweets):
    # One row per tweet: text, text length, creation time, retweet count, language
    df = pd.DataFrame(data=[tweet.text for tweet in public_tweets], columns=['Tweets'])
    df['len'] = np.array([len(tweet.text) for tweet in public_tweets])
    df['date'] = np.array([tweet.created_at for tweet in public_tweets])
    df['retweets'] = np.array([tweet.retweet_count for tweet in public_tweets])
    df['lang'] = np.array([tweet.lang for tweet in public_tweets])
    return df

public_tweet = api.search('donald trump')
df = tweets_to_data_frame(public_tweet)
df.to_csv('donaldtrump.csv')
df.head(15)
    Tweets  len date    retweets    lang
0   RT @mehdirhasan: Stephen Miller’s Jewish uncle...   140 2019-04-09 11:08:23 67  en
1   RT @errollouis: "If the House ever gets his re...   140 2019-04-09 11:08:23 7927    en
2   RT @BillKristol: "This is what Kirstjen Nielse...   140 2019-04-09 11:08:22 73  en
3   RT @Newsweek: Trump claimed he wouldn't have t...   140 2019-04-09 11:08:21 7   en
4   RT @mehdirhasan: Stephen Miller’s Jewish uncle...   140 2019-04-09 11:08:20 67  en
5   The real reason Donald Trump just fired the he...   112 2019-04-09 11:08:19 0   en
6   RT @BillKristol: "This is what Kirstjen Nielse...   140 2019-04-09 11:08:19 73  en
7   RT @BobbyEberle13: Ilhan Omar is now praying f...   140 2019-04-09 11:08:18 457 en
8   The guy met the queen last time out and lots o...   140 2019-04-09 11:08:17 0   en
9   RT @PalmerReport: Donald Trump’s deconstructio...   135 2019-04-09 11:08:17 107 en
10  RT @ByronYork: Donald Trump has been paying ta...   139 2019-04-09 11:08:16 1232    en
11  RT @mehdirhasan: Stephen Miller’s Jewish uncle...   140 2019-04-09 11:08:16 67  en
12  RT @SayWhenLA:  YUGE !!\n\nPresident Donald J...  140 2019-04-09 11:08:15 1316    en
13  "As long as you're going to be thinking anyway...   100 2019-04-09 11:08:15 0   en
14  RT @TheLastRefuge2: Diana West Discusses The R...   140 2019-04-09 11:08:15 113 en

What I want is the data for one week. My idea is:

def tweets_to_data_frame1(public_tweets):
    for tweets in tweepy.Cursor(api.search,q = (public_tweets),count=100,
                           since = "2019-04-04",
                           until = "2019-04-07").items():
        df = pd.DataFrame(data=[tweets.text for tweet in tweets], columns=['Tweets'])
        df['len'] = np.array([len(tweets.text) for tweet in tweets])
        df['date'] = np.array([tweets.created_at for tweet in tweets])
        df['retweets'] = np.array([tweets.retweet_count for tweet in tweets])
        df['lang'] = np.array([tweets.lang for tweet in tweets])

        return df

df1 = tweets_to_data_frame1('donald trump')

error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-96745c16c99c> in <module>
----> 1 df1 = tweets_to_data_frame1('donald trump')

<ipython-input-23-e5866a4adb3f> in tweets_to_data_frame1(public_tweets)
      3                            since = "2019-04-04",
      4                            until = "2019-04-07").items():
----> 5         df = pd.DataFrame(data=[tweets.text for tweet in tweets], columns=['Tweets'])
      6 
      7         #df['id'] = np.array([tweet.id for tweet in tweets])

TypeError: 'Status' object is not iterable

expected results:

Tweets  len date    retweets    lang
0   RT @mehdirhasan: Stephen Miller’s Jewish uncle...   140 2019-04-09 11:08:23 67  en
1   RT @errollouis: "If the House ever gets his re...   140 2019-04-09 11:08:23 7927    en
2   RT @BillKristol: "This is what Kirstjen Nielse...   140 2019-04-09 11:08:22 73  en
3   RT @Newsweek: Trump claimed he wouldn't have t...   140 2019-04-09 11:08:21 7   en
4   RT @mehdirhasan: Stephen Miller’s Jewish uncle...   140 2019-04-09 11:08:20 67  en
5   The real reason Donald Trump just fired the he...   112 2019-04-09 11:08:19 0   en
6   RT @BillKristol: "This is what Kirstjen Nielse...   140 2019-04-09 11:08:19 73  en
7   RT @BobbyEberle13: Ilhan Omar is now praying f...   140 2019-04-09 11:08:18 457 en
8   The guy met the queen last time out and lots o...   140 2019-04-09 11:08:17 0   en
9   RT @PalmerReport: Donald Trump’s deconstructio...   135 2019-04-09 11:08:17 107 en
10  RT @ByronYork: Donald Trump has been paying ta...   139 2019-04-09 11:08:16 1232    en
11  RT @mehdirhasan: Stephen Miller’s Jewish uncle...   140 2019-04-09 11:08:16 67  en
12  RT @SayWhenLA:  YUGE !!\n\nPresident Donald J...  140 2019-04-09 11:08:15 1316    en
13  "As long as you're going to be thinking anyway...   100 2019-04-09 11:08:15 0   en
14  RT @TheLastRefuge2: Diana West Discusses The R...   140 2019-04-09 11:08:15 113 en

but for one week

aiman khalid

1 Answer

So I guess the issue is here:

for tweets in tweepy.Cursor(api.search,q = (public_tweets),count=100,since = "2019-04-04",until = "2019-04-07").items():

tweepy.Cursor(...).items() returns an iterator that yields one tweet (a Status object) at a time, so each value of the tweets variable is a single tweet. The list comprehension inside the loop then tries to iterate over that single tweet, which is exactly what the error message is telling you.
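To see the error in isolation, here is a minimal sketch that needs no Twitter credentials. `FakeStatus` and `fake_items` are hypothetical stand-ins for tweepy's Status objects and for `Cursor(...).items()`; they are not part of tweepy.

```python
class FakeStatus:
    """Hypothetical stand-in for a tweepy Status (not part of tweepy)."""

    def __init__(self, text):
        self.text = text


def fake_items():
    """Hypothetical stand-in for Cursor(...).items(): yields one status at a time."""
    yield FakeStatus("first tweet")
    yield FakeStatus("second tweet")


def first_error_message():
    """Repeat the mistake from the question and return the error message."""
    for tweets in fake_items():  # despite the plural name, this is ONE status
        try:
            return [tweet.text for tweet in tweets]  # iterating a single object
        except TypeError as exc:
            return str(exc)


print(first_error_message())
```

The printed message has the same shape as the one in the traceback, only with the stand-in class name instead of `Status`.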

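What you could do instead is collect the results into a list first (the iterator can only be consumed once, and you need the tweets again for the other columns) and build the DataFrame from that list:

tweets = list(tweepy.Cursor(...).items())
df = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])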
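One caveat when filling several columns from the same results: as far as I know, `items()` gives a one-shot iterator, so a second pass over it finds nothing. The sketch below shows the effect with a plain generator; `fake_items` is a hypothetical stand-in, not tweepy code.

```python
def fake_items():
    """Hypothetical stand-in for Cursor(...).items(): a one-shot iterator."""
    yield "first tweet"
    yield "second tweet"


# Reusing the raw iterator: the second pass finds nothing left.
one_shot = fake_items()
texts = [t for t in one_shot]
lengths = [len(t) for t in one_shot]

# Materialising it once lets every column see every item.
stored = list(fake_items())
stored_texts = [t for t in stored]
stored_lengths = [len(t) for t in stored]

print(len(texts), len(lengths))
print(len(stored_texts), len(stored_lengths))
```

Wrapping the cursor results in `list(...)` before building the columns avoids ending up with columns of different lengths.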

BTW, I would also rename the public_tweets argument of tweets_to_data_frame1. In this case it is just a search query string, so the name is misleading; something like query would be clearer.
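Putting the pieces together, a possible version of the function could look like the sketch below. It takes any iterable of tweet-like objects, so it can be tried without credentials; the name `statuses_to_data_frame` and the sample data are made up for illustration, and only the attribute names (`text`, `created_at`, `retweet_count`, `lang`) come from the question.

```python
from types import SimpleNamespace

import pandas as pd


def statuses_to_data_frame(statuses):
    """Build one DataFrame from any iterable of tweet-like objects."""
    statuses = list(statuses)  # materialise once; iterators are single-use
    return pd.DataFrame({
        'Tweets': [s.text for s in statuses],
        'len': [len(s.text) for s in statuses],
        'date': [s.created_at for s in statuses],
        'retweets': [s.retweet_count for s in statuses],
        'lang': [s.lang for s in statuses],
    })


# Hypothetical sample data standing in for real search results.
sample = (
    SimpleNamespace(text=f"tweet {i}", created_at=f"2019-04-0{4 + i}",
                    retweet_count=i, lang="en")
    for i in range(3)
)
df1 = statuses_to_data_frame(sample)
print(df1.shape)
```

With real data you would pass the cursor results instead of `sample`, for example `statuses_to_data_frame(tweepy.Cursor(api.search, q=query, count=100, since="2019-04-04", until="2019-04-07").items())`, and then call `df1.to_csv(...)` as in the question. Two things to keep in mind, to the best of my knowledge: the standard search endpoint only reaches back about seven days, and tweepy 4.x renamed `api.search` to `api.search_tweets`.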

running.t