I have a school project I'm working on in a web mining class where I need to collect a lot of data from certain social media sites. I need data for a large number of individual hashtags. I have a Python script that successfully grabs all the data I need for a single hashtag by making sequential HTTP requests until it has captured every record in the specified time range, then saves them to a large CSV file. I need to run this program a couple thousand times for different hashtags. For some very popular hashtags, the program takes a few hours to run; many of the others are much faster. I wrote a bash script that runs the Python program for each hashtag sequentially, but this will take a very long time to collect everything needed.
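To illustrate, the per-hashtag flow looks roughly like this (the function names and the paged-fetch stub are placeholders, not my real code):

```python
import csv

def fetch_page(hashtag, cursor):
    # Hypothetical stub standing in for the real HTTP request. The actual
    # script keeps requesting pages from the site until the time range is
    # exhausted. Returns (records, next_cursor); next_cursor is None when
    # there are no more pages.
    pages = {
        None: ([{"id": 1, "text": "first post"}], "page2"),
        "page2": ([{"id": 2, "text": "second post"}], None),
    }
    return pages[cursor]

def scrape_hashtag(hashtag, out_path):
    # Sequentially page through results and append each batch to a CSV.
    cursor = None
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "text"])
        writer.writeheader()
        while True:
            records, cursor = fetch_page(hashtag, cursor)
            writer.writerows(records)
            if cursor is None:
                break

scrape_hashtag("example", "example.csv")
```

The real script is just this loop with actual HTTP requests and rate-limit handling in place of the stub.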
I wanted to use some kind of cloud computing service, like Google Compute Engine, AWS, or Azure, to run multiple instances of this program in parallel so I could collect the data for many hashtags at once. Ideally I'd have a large number of cloud machines all running the program for different hashtags at the same time. This is just so I can collect all the data I need faster.
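Conceptually, the shape I'm after is just mapping hashtags across independent workers, whether those are processes on one big machine or separate cloud instances. A minimal local sketch of that fan-out, with a stand-in worker instead of my real scraper:

```python
from concurrent.futures import ThreadPoolExecutor

def scrape(hashtag):
    # Stand-in for launching the real scraper, e.g. via
    # subprocess.run(["python", "scraper.py", hashtag]).
    # Here it just returns the name of the CSV it would produce.
    return f"{hashtag}.csv"

hashtags = ["news", "sports", "music", "tech"]

# max_workers would be tuned to available cores and the site's rate limits;
# threads are fine here because each worker mostly waits on I/O.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(scrape, hashtags))

print(results)
```

What I don't know is the right way to get this same fan-out across cloud machines instead of local workers.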
I'm not very experienced with cloud computing beyond a few times I've used Google Compute Engine for simple programs I only needed to run once. I've tried reading about instance groups, but I'm still not sure how I would use them for this purpose, and I'm even less familiar with what AWS and Azure offer.
What's the best way to go about this?