
I'm using the Streaming API to track a specific user and collect all of their tweets and retweets. However, as far as I know, there is no way to capture retweets of a retweet, because they don't show up in the Streaming API. For example, say I'm tracking user A. If user B retweets one of A's tweets, the Streaming API captures that. However, if user C sees something interesting on B's timeline and clicks retweet, the stream does not capture it.

I tried calling the statuses/retweets API with the id of B's retweet of A's tweet, and it comes back empty. So I'm wondering whether there is any way to get retweets of a retweet.

The problem I'm having right now is this. Let's say A's tweet gets 5K retweets, but the Streaming API captures only 1K of them, because those users retweeted directly from A's tweet. The remaining 4K retweets come from the followers of A, and the stream cannot capture those.

Here's my code for streaming API.

#!/usr/bin/env python
#Import the necessary methods from tweepy library
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
import json
from pymongo import MongoClient

from sweepy.get_config import get_config

config = get_config()

MONGO_URL = config.get('MONGO_URL')
MONGO_PORT = config.get('MONGO_PORT')
MONGO_USERNAME = config.get('MONGO_USERNAME')
MONGO_PASSWORD = config.get('MONGO_PASSWORD')

connection = MongoClient(MONGO_URL, int(MONGO_PORT))
db = connection['tweets']

# MongoLab requires authentication
db.authenticate(MONGO_USERNAME, MONGO_PASSWORD)

#Variables that contain the user credentials to access Twitter API
consumer_key = config.get('STREAM_TWITTER_CONSUMER_KEY')
consumer_secret = config.get('STREAM_TWITTER_CONSUMER_SECRET')
access_token = config.get('STREAM_TWITTER_ACCESS_TOKEN')
access_token_secret = config.get('STREAM_TWITTER_ACCESS_TOKEN_SECRET')

#This is a basic listener that stores each received tweet in MongoDB.
class StdOutListener(StreamListener):

    def on_data(self, data):
        mydata = json.loads(data)
        db.raw_tweets.insert_one(mydata)
        return True

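    def on_error(self, status):
        # tweepy passes the HTTP status code (an int) here, not a JSON string,
        # so store it directly instead of calling json.loads on it
        db.error_tweets.insert_one({'status_code': status})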


if __name__ == '__main__':

    #This handles Twitter authentication and the connection to Twitter Streaming API
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, l)

    #This line filters the stream by user ID (follow) rather than by keyword
    stream.filter(follow=['121817564'])
toy
  • Did you ever solve this? This is pretty much the exact question I was coming here to ask, and I'd like to know if you ever found a way to do what you wanted. – dsollen Jun 25 '20 at 18:35

1 Answer


This is not an answer but it is far too long for a comment...

There is something I haven't understood in your question, and maybe its premise is not quite right: my point is that if a tweet from A gets 5k retweets, the streaming API could potentially deliver them all (but in practice you get a sample, depending also on your endpoint, certification status, etc.).
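One way to check how far the stream actually falls short is to compare the number of retweets you captured with the `retweet_count` that Twitter reports inside each payload. This is only a sketch: it assumes v1.1-style tweet dictionaries, where a native retweet embeds the original tweet under `retweeted_status`, and the helper name `retweet_coverage` and the sample data are made up for illustration.

```python
from collections import defaultdict


def retweet_coverage(tweets):
    """Map each original tweet id to (captured, reported) retweet counts.

    Assumes v1.1-style payloads: a native retweet embeds the original
    tweet under 'retweeted_status', including its 'retweet_count'.
    """
    captured = defaultdict(int)
    reported = defaultdict(int)
    for tweet in tweets:
        original = tweet.get('retweeted_status')
        if original is None:
            continue  # not a native retweet, nothing to count
        original_id = original['id_str']
        captured[original_id] += 1
        # Keep the highest count seen so far, since it grows over time.
        reported[original_id] = max(reported[original_id],
                                    original.get('retweet_count', 0))
    return {tid: (captured[tid], reported[tid]) for tid in captured}


# Hypothetical sample: two captured retweets of tweet '1', one plain tweet.
sample = [
    {'id_str': '10', 'retweeted_status': {'id_str': '1', 'retweet_count': 3}},
    {'id_str': '11', 'retweeted_status': {'id_str': '1', 'retweet_count': 5}},
    {'id_str': '12', 'text': 'an ordinary tweet'},
]
coverage = retweet_coverage(sample)
```

With the documents already stored in `db.raw_tweets`, passing `db.raw_tweets.find()` to the helper should give the same kind of summary for real data, and a large gap between the two numbers would point at stream limits rather than at missing retweets of retweets.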

Let's see: if B retweets A, B can do it in two ways: (1) posting new text along with it, or (2) not posting anything, just retweeting.

In case (2), any C retweeting B's tweet is treated just like a retweet of A: A's retweet count is updated, and you will get it in the streaming API.

However, in case (1), if C is following B and sees the tweet, C can retweet in two ways: (1.1) if C just retweets the tweet from B, then A's retweet count will not be updated and the streaming API will not get it, but (1.2) if C clicks on A's message and retweets that, then it is just like case (2).

So, if your problem is just about matching the retweet count of A's tweet, the issue is not retweets of the retweet but the limitations of the streaming API. However, if you want to get the retweets in case (1.1), I don't have an answer for that.
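To see which of these cases a captured tweet belongs to, the payload itself can be inspected. The sketch below assumes v1.1-style tweet dictionaries, where a native retweet carries `retweeted_status` and a tweet that adds its own text (treated here as a quote tweet, which is my reading of case (1)) carries `quoted_status`. The function name `classify` and the sample payloads are hypothetical.

```python
def classify(tweet, tracked_user_id):
    """Label a tweet dict relative to the tracked user's id (a string)."""
    def author_id(status):
        return status.get('user', {}).get('id_str')

    retweeted = tweet.get('retweeted_status')
    if retweeted is not None:
        # Native retweet: the embedded status is what was retweeted.
        if author_id(retweeted) == tracked_user_id:
            return 'retweet_of_tracked'   # case (2) and case (1.2)
        return 'retweet_of_other'         # e.g. case (1.1): retweet of B's tweet

    quoted = tweet.get('quoted_status')
    if quoted is not None and author_id(quoted) == tracked_user_id:
        return 'quote_of_tracked'         # case (1): B adds new text
    return 'other'


TRACKED = '121817564'  # the id passed to stream.filter(follow=[...])

# Hypothetical payloads, reduced to the fields the function reads.
native = {'user': {'id_str': '2'},
          'retweeted_status': {'user': {'id_str': TRACKED}}}
quote = {'user': {'id_str': '2'},
         'quoted_status': {'user': {'id_str': TRACKED}}}
retweet_of_quote = {'user': {'id_str': '3'},
                    'retweeted_status': {'user': {'id_str': '2'}}}

labels = [classify(t, TRACKED) for t in (native, quote, retweet_of_quote)]
```

The third payload is the (1.1) situation: its `retweeted_status` belongs to B rather than to A, which is consistent with a stream that follows only A not delivering it.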

Hope it helps.

lrnzcig