
I wrote a Python script that pulls data from a third-party API and pushes it into a SQL table I set up in AWS RDS. I want to automate this script so that it runs every night (it only takes about a minute to run). I need to find a good place and way to set it up so that it runs each night.

I could set up an EC2 instance with a cron job and run it from there, but it seems expensive to keep an EC2 instance alive all day for only one minute of run time per night. Would AWS Data Pipeline work for this purpose? Are there better alternatives?

(I've seen similar topics discussed when googling around but haven't seen recent answers.)

Thanks

ansonw

2 Answers


Based on your case, I think you can try ShellCommandActivity in Data Pipeline. It launches an EC2 instance and executes the command you give it on your schedule. After the task finishes, the pipeline terminates the EC2 instance.

Here are the docs:

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-shellcommandactivity.html

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-ec2resource.html
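
As a rough illustration of the setup described above, here is a minimal sketch that builds a pipeline definition as a Python dict and prints it as JSON. The object ids (`NightlySchedule`, `NightlyInstance`, `RunScript`), the script path, and the 15-minute `terminateAfter` cap are assumptions for illustration; the field names follow the pipeline definition format as I understand it, so check them against the docs linked above before relying on them.

```python
import json


def build_pipeline_definition(command):
    """Return a pipeline definition dict that runs `command` once a day."""
    return {
        "objects": [
            {
                # Defaults inherited by every other object in the pipeline.
                "id": "Default",
                "name": "Default",
                "scheduleType": "cron",
                "schedule": {"ref": "NightlySchedule"},
                "role": "DataPipelineDefaultRole",
                "resourceRole": "DataPipelineDefaultResourceRole",
            },
            {
                # Hypothetical id: one run per day, starting at activation.
                "id": "NightlySchedule",
                "name": "NightlySchedule",
                "type": "Schedule",
                "period": "1 day",
                "startAt": "FIRST_ACTIVATION_DATE_TIME",
            },
            {
                # Temporary EC2 instance; the cap is an assumed safety limit.
                "id": "NightlyInstance",
                "name": "NightlyInstance",
                "type": "Ec2Resource",
                "terminateAfter": "15 Minutes",
            },
            {
                # The shell command to execute on the temporary instance.
                "id": "RunScript",
                "name": "RunScript",
                "type": "ShellCommandActivity",
                "command": command,
                "runsOn": {"ref": "NightlyInstance"},
            },
        ]
    }


if __name__ == "__main__":
    # Hypothetical script location on the instance.
    definition = build_pipeline_definition("python /home/ec2-user/nightly_import.py")
    print(json.dumps(definition, indent=2))
```

The printed JSON could then be saved to a file and supplied as the pipeline definition, for example through the console or the CLI.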

Junren
  • Thanks, Junren. I saw that AWS Lambda now has Python support. Any thoughts on whether it would be better to go with AWS Data Pipeline over Lambda? – ansonw Oct 26 '15 at 23:19
  • @ansonw It's possible to do that in Lambda, and it now supports cron-like schedules. However, a Lambda function does not run at a fixed IP address, and each run will most likely come from a different address, so you would need to open your RDS instance to all IPs. Alternatively, you could run RDS in a VPC; I heard Lambda now supports accessing a VPC as well, but I haven't tried that setup. – piggybox Oct 27 '15 at 00:38
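
For the Lambda route discussed in the comments above, a minimal sketch of a scheduled handler might look like the following. The API URL, the 03:00 UTC schedule, and the `fetch_records`/`store_records` helpers are hypothetical placeholders; the database insert is deliberately left unimplemented because it depends on your driver and network setup.

```python
import json
import urllib.request

# Hypothetical values; replace with your own.
API_URL = "https://api.example.com/records"
# Schedule expression for the rule that triggers the function:
# 03:00 UTC every day.
SCHEDULE_EXPRESSION = "cron(0 3 * * ? *)"


def fetch_records(url=API_URL):
    """Download and decode a JSON list of records from the third-party API."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def store_records(records):
    """Placeholder: insert the records into the RDS table with your DB driver."""
    raise NotImplementedError("connect to RDS and INSERT the records here")


def lambda_handler(event, context, fetch=fetch_records, store=store_records):
    """Entry point invoked on each scheduled run.

    Lambda passes only `event` and `context`; `fetch` and `store` are
    extra parameters with defaults so the handler can be tested locally.
    """
    records = fetch()
    store(records)
    return {"stored": len(records)}
```

Locally, the handler can be exercised without network or database access by passing stand-ins, e.g. `lambda_handler({}, None, fetch=lambda: [{"id": 1}], store=print)`.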

Alternatively, you could use a 3rd-party service like Crono. Crono is a simple REST API to manage time-based jobs programmatically.

gduverger