Our group is working on a sentiment analysis research project, and we are trying to use the Twitter API to collect tweets. Our target dataset involves a lot of query terms and filters. Since each of us has a developer account, we were wondering whether we can pool our API access tokens to speed up the data collection. For example, we would build an app that reads a configuration file containing a list of our access tokens, which the app would then use to run the searches. The app would run on our local computer. Since the app uses our individual access tokens, we believe we are not actually bypassing or changing any Twitter limit, as usage is tracked per access token. Are there any legal or technical problems that may arise from this methodology? Thank you! =D

Here is pseudocode for what we are trying to do:

1. Define a list of search terms such as 'apple', 'banana' and 'oranges' (we have 100 of these search terms; we are okay with the 100-tweet limit per query).

2. Using TF-IDF, define a list of frequent emotional adjectives such as 'happy', 'sad', 'crazy', etc. (we have 100 of these).

3. Take the product of the search terms and emotional adjectives. In total we have 10,000 query terms, and from the rate limit rules we have computed that we would need at least 55 runs of 15-minute sessions at 180 requests per 15-minute window. 55 * 15 = 825 minutes, or ~14 hours, to collect this amount of tweets.

4. We were thinking of speeding up the data collection by pooling access tokens so that we can trim the collection time from 14 hours to ~4 hours, e.g. by dividing the query items into subsets and letting a specific access token work on each subset.
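
The arithmetic in step 3 can be sketched as follows. This is a minimal illustration for a single access token, not our actual collection code: the function names are hypothetical, each query is assumed to cost exactly one request (no pagination), and the defaults of 180 requests per 15-minute window are the figures quoted above.

```python
import math
from itertools import product


def build_queries(search_terms, adjectives):
    """Pair every search term with every adjective (Cartesian product)."""
    return [f"{term} {adj}" for term, adj in product(search_terms, adjectives)]


def estimate_collection_time(num_queries, requests_per_window=180, window_minutes=15):
    """Return (windows, minutes) needed to issue num_queries requests.

    Assumes one request per query and a fixed quota per window.
    """
    windows = math.ceil(num_queries / requests_per_window)
    return windows, windows * window_minutes


# Small example with two terms and two adjectives.
sample = build_queries(["apple", "banana"], ["happy", "sad"])
print(sample)

# 100 terms x 100 adjectives = 10,000 queries on one token.
windows, minutes = estimate_collection_time(100 * 100)
print(windows, minutes, round(minutes / 60, 1))
```

Rounding the window count up gives 56 windows (840 minutes) rather than 55, which is still roughly 14 hours.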

We are pushing for this because, if it is possible and permitted, it seems more efficient, and it might help future research as well.

The question is: are we actually breaking any Twitter rules or policies by doing this? Each of the three of us would contribute one access token and create an app named as a clone of the research project. In return, we believe each of us gives something up, namely the headroom for one more app that we fully control.

I can't find a specific Twitter rule about this so far. Our concern is that we will publish a paper and, for documentation, the app we plan to build and use. Disclaimer: only the app's source code will be published, not the dataset, because of Twitter's explicit rules about datasets.

1 Answer

This is absolutely not allowed under the Twitter Developer Policy and Agreement.

Twitter Developer Policy, section 5a:

Do not do any of the following: Use a single application API key for multiple use cases or multiple application API keys for the same use case.

Feel free to check with Twitter directly via the developer forums. Stack Overflow is not really the best place for this question, since it is not specifically a coding question.

Andy Piper