
I need to back up a few DynamoDB tables (none of them very large for now) to S3. These tables are used and maintained by another team, not by me. The backups need to happen once a week and will only be used to restore the tables in a disastrous situation (so hopefully never). I saw that there is a way to do this by setting up a Data Pipeline, which I'm guessing can be scheduled to run the job once a week. However, it seems like this would keep the pipeline open and start incurring charges. So I was wondering whether there is a significant cost difference between backing the tables up via the pipeline and keeping it open, versus creating something like a PowerShell script, scheduled to run on an EC2 instance that already exists, which would manually create a JSON mapping file and upload it to S3.
My other question is more about practicality: how difficult is it to back up DynamoDB tables to JSON format? It doesn't seem too hard, but I wasn't sure. Sorry if these questions are too general.
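
For scale, here is a minimal sketch of what that manual script would amount to, assuming hypothetical table and bucket names (shown in Python/boto3 rather than PowerShell; binary attributes and DynamoDB set types would need extra handling):

```python
import json
from decimal import Decimal

import boto3


def decimal_default(obj):
    # DynamoDB numbers come back as Decimal, which json.dumps can't serialize directly.
    if isinstance(obj, Decimal):
        return str(obj)
    raise TypeError(f"Cannot serialize {type(obj)}")


def backup_table(table_name, bucket, key):
    table = boto3.resource("dynamodb").Table(table_name)

    # Scan returns at most 1 MB per call, so follow LastEvaluatedKey to page through.
    items = []
    response = table.scan()
    items.extend(response["Items"])
    while "LastEvaluatedKey" in response:
        response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])
        items.extend(response["Items"])

    body = json.dumps(items, default=decimal_default)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))


backup_table("my-table", "my-backup-bucket", "backups/my-table.json")  # hypothetical names
```

Note that a full scan consumes read capacity, so a weekly run against small tables is cheap, but larger tables would need the reads throttled.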

D. King

2 Answers


You seem to be working under the assumption that Data Pipeline keeps a server up forever. That is not the case.

For instance, if you have defined a Shell Activity, the server will terminate after the activity completes. (You may manually set termination protection if you need the resource to persist.)

Since you only run a pipeline once a week, the costs are not high.

If you run a cron job on an EC2 instance, that instance needs to be up whenever you want to run the backup, and that is a potential point of failure.

Incidentally, Amazon provides a Data Pipeline sample showing how to export data from DynamoDB.

user1452132
  • The EC2 instance I mentioned is already running constantly, since it handles other jobs that must run continually. I was just afraid that running a pipeline job once a week, while keeping the pipeline created but unused, would incur a large cost. I guess I could theoretically create the data pipeline each week, run the job, and then delete it (sketched below). – D. King Sep 20 '15 at 21:39
  • It incurs $1 per month just for the pipeline (https://aws.amazon.com/datapipeline/pricing/), plus the cost of running an EC2 instance during task execution. If you set the task frequency to weekly, Data Pipeline provisions a resource at that time, runs the task, and terminates the resource when it completes. – user1452132 Sep 21 '15 at 17:14
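
As a rough illustration of the create/run/delete flow from the comment above, here is a sketch assuming placeholder names and region; the actual DynamoDB-to-S3 export definition is omitted, and in practice you would load Amazon's export template and pass its objects to put_pipeline_definition before activating:

```python
import boto3

dp = boto3.client("datapipeline", region_name="us-east-1")

# Create the pipeline shell; uniqueId makes the call idempotent if it is retried.
created = dp.create_pipeline(name="weekly-ddb-backup", uniqueId="weekly-ddb-backup-w38")
pipeline_id = created["pipelineId"]

# For the real job you would now upload the export definition and start it:
#   dp.put_pipeline_definition(pipelineId=pipeline_id, pipelineObjects=export_objects)
#   dp.activate_pipeline(pipelineId=pipeline_id)
# ...and wait for the run to finish before cleaning up.

# Deleting the pipeline afterwards stops the per-pipeline monthly charge.
dp.delete_pipeline(pipelineId=pipeline_id)
```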

I just checked the pipeline cost page, and it says "For example, a pipeline that runs a daily job (a Low Frequency activity) on AWS to replicate an Amazon DynamoDB table to Amazon S3 would cost $0.60 per month". So I think I'm safe.

D. King