
I have a Python web scraper script that runs a list of functions every 1-2 hours on a Raspberry Pi. I want to be able to run X number of instances simultaneously, each with its own parameters (credentials, configuration, proxy details, etc.). I've looked into task queues, serverless functions, VPSes, and services like Google Cloud Run, but I'm not sure which would be the best option for something like this.
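
For reference, the per-instance parameters would look roughly like this (field names and defaults are illustrative, not my actual config):

```python
# Rough shape of the per-instance parameters -- field names are illustrative.
from dataclasses import dataclass

@dataclass
class ScraperConfig:
    username: str                 # account credentials for the target site
    password: str
    proxy_url: str | None = None  # e.g. "http://user:pass@host:port"
    poll_interval_mins: int = 90  # the cycle runs every 1-2 hours
```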

The functions are dependent on each other, so they must run in a specific order, but some of them make many requests and need to run asynchronously. Each cycle can take anywhere from a minute to 30+ minutes depending on how much new data there is to grab (and on API rate limiting).

Here is a sample list of the functions (a rough sketch of how they chain together is below):

  1. Get website information updates
  2. Get a list of new/updated posts
  3. Gather information from each new/updated post (async)
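
To make the ordering clearer, here's a stripped-down sketch of one cycle. The function bodies, names, URLs, and the aiohttp usage are placeholders rather than my real code; the point is just that step 3 fans out many requests concurrently while steps 1 and 2 run in order:

```python
# Simplified sketch of one cycle -- names and URLs are placeholders.
import asyncio
import aiohttp

def get_site_updates(config: dict) -> dict:
    # Step 1: a few ordinary blocking requests for site-level updates.
    return {"last_checked": "..."}

def get_new_or_updated_posts(config: dict, site_info: dict) -> list[str]:
    # Step 2: depends on step 1's output.
    return ["post-1", "post-2"]

async def fetch_post(session: aiohttp.ClientSession, post_id: str) -> dict:
    # Step 3 runs one of these per post, concurrently (many requests).
    async with session.get(f"https://example.com/posts/{post_id}") as resp:
        return {"id": post_id, "body": await resp.text()}

async def run_cycle(config: dict) -> None:
    site_info = get_site_updates(config)                 # step 1
    posts = get_new_or_updated_posts(config, site_info)  # step 2, needs step 1
    async with aiohttp.ClientSession() as session:       # step 3, needs step 2
        details = await asyncio.gather(*(fetch_post(session, p) for p in posts))
    print(f"gathered {len(details)} posts")

if __name__ == "__main__":
    asyncio.run(run_cycle({"proxy": None}))
```

The question is essentially how best to run N copies of a cycle like this on a schedule, each with its own config.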