I have a service that successfully monitors Twitter statuses via the Streaming API (the filter endpoint with the track parameter). Everything works well, and I receive a lot of tweets containing the predefined keywords. The only problem is that I do not get my own tweets containing these keywords. Is this normal? Should I use a separate, dedicated account for the application if it must collect ALL relevant data on Twitter, including my own messages?
Thanks in advance.
UPDATE
I've found a partial answer here, and I'm posting a part of the Twitter staff explanation below, for reference:
With [node:10389] in particular, you are filtering from the firehose, with a maximum resulting volume of 1% of the total Tweets at that moment... In other words, if the keywords you are tracking account for less than 1% of the firehose, you will receive all the matching Tweets, otherwise you will be capped. To give you an idea, there are more than 500 million Tweets posted every single day on Twitter, so 1% still represents a very large number.
So the tweets we receive via the Streaming API are just an arbitrary subset of all the tweets matching the given predicates. BTW, I doubt that my keywords account for 1% of Twitter's entire data flow, but I can't verify this.
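For a rough sense of scale, the cap can be worked out from the figures in the quote. This is only a back-of-the-envelope sketch: it takes "more than 500 million Tweets per day" as exactly 500 million and assumes tweets are spread evenly over the day, which is certainly not true, since the cap is described as 1% of the volume "at that moment". The function names are my own, not part of any Twitter API.

```python
# Figures taken from the quoted staff explanation; the even spread over
# the day is an assumption made purely for illustration.
TWEETS_PER_DAY = 500_000_000   # "more than 500 million" (lower bound)
CAP_PERCENT = 1                # streaming cap as a percentage of the firehose
SECONDS_PER_DAY = 24 * 60 * 60


def cap_per_day(total_per_day=TWEETS_PER_DAY, percent=CAP_PERCENT):
    """Maximum number of tweets per day the filter stream could deliver."""
    return total_per_day * percent // 100


def cap_per_second(total_per_day=TWEETS_PER_DAY, percent=CAP_PERCENT):
    """Average per-second cap, assuming a uniform tweet rate."""
    return cap_per_day(total_per_day, percent) / SECONDS_PER_DAY


def is_likely_capped(observed_per_second):
    """True if the observed delivery rate reaches the estimated average cap."""
    return observed_per_second >= cap_per_second()


if __name__ == "__main__":
    print("cap per day:   ", cap_per_day())
    print("cap per second:", round(cap_per_second(), 2))
    print("capped at 10 tweets/s?", is_likely_capped(10))
```

So unless my service is receiving on the order of several dozen tweets per second, the keywords are probably nowhere near the cap, at least under these simplified assumptions.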
OK, nothing can be done about that, but then the next question is: how can I determine what percentage of the firehose I'm getting at any given moment? If I knew this, I could adjust the predicates to narrow my query and try to get much more than the default 1%, with improved relevance and data flow coverage.