I need to back up a few DynamoDB tables (none of them very large for now) to S3. These tables belong to another team, though; I don't work on them myself. The backups need to happen once a week and will only be used to restore the tables in disastrous situations (so hopefully never).
I saw that one way to do this is to set up a Data Pipeline, which I'm guessing can be scheduled to run the job once a week. However, it seems like this would keep the pipeline open and start incurring charges. So I was wondering: is there a significant cost difference between backing the tables up via the pipeline and keeping it open, versus something like a PowerShell script, scheduled to run on an EC2 instance that already exists, which would manually create a JSON mapping file and upload it to S3?
My other question is more about practicality: how difficult is it to back up DynamoDB tables to JSON format? It doesn't seem too hard, but I wasn't sure. Sorry if these questions are too general.
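For context, the manual script route I have in mind mostly amounts to paginating a Scan and flattening DynamoDB's typed attribute values (e.g. `{"S": "abc"}`, `{"N": "42"}`) into plain JSON. A rough sketch of the flattening step, with the actual AWS calls left as comments since they need credentials:

```python
import json

def plain_value(typed):
    """Convert one DynamoDB typed attribute value, e.g. {"S": "abc"}
    or {"N": "42"}, into a plain JSON-friendly Python value."""
    ((tag, val),) = typed.items()
    if tag == "S":
        return val
    if tag == "N":
        # DynamoDB returns numbers as strings; keep int/float distinction
        return float(val) if "." in val else int(val)
    if tag == "BOOL":
        return val
    if tag == "NULL":
        return None
    if tag == "L":
        return [plain_value(v) for v in val]
    if tag == "M":
        return {k: plain_value(v) for k, v in val.items()}
    if tag in ("SS", "NS", "BS"):
        return list(val)
    raise ValueError("unhandled DynamoDB type: %s" % tag)

def items_to_json(items):
    """Serialize a list of DynamoDB items (as returned by a Scan) to JSON."""
    return json.dumps(
        [{k: plain_value(v) for k, v in item.items()} for item in items],
        indent=2,
    )

# In practice the items would come from a paginated boto3 scan() on the
# table, and the resulting string would be uploaded with
# s3.put_object(Bucket=..., Key=..., Body=...); both are omitted here.
```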

2 Answers
Are you working under the assumption that Data Pipeline keeps the server up forever? That is not the case.
For instance, if you have defined a Shell Activity, the server will terminate after the activity completes. (You can manually set termination protection.)
Since you only run the pipeline once a week, the costs are not high.
If you run a cron job on an EC2 instance, that instance needs to be up whenever the backup runs, and that is a potential point of failure.
Incidentally, Amazon provides a Data Pipeline sample showing how to export data from DynamoDB.
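If you ever do need to restore from such a JSON dump, the items have to be converted back into DynamoDB's typed format before being written. A minimal sketch of that inverse mapping (the function names are illustrative, and the actual boto3 write call is commented out because it needs AWS credentials):

```python
import json

def typed_value(value):
    """Map a plain Python value back to DynamoDB's typed attribute
    format, e.g. "abc" -> {"S": "abc"}, 42 -> {"N": "42"}."""
    if isinstance(value, bool):  # check bool before int: bool is an int subtype
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}
    if isinstance(value, str):
        return {"S": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, list):
        return {"L": [typed_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: typed_value(v) for k, v in value.items()}}
    raise TypeError("unhandled type: %r" % type(value))

def restore_requests(json_dump):
    """Turn a JSON dump (a list of plain items) into PutRequest entries
    suitable for a boto3 batch_write_item call."""
    items = json.loads(json_dump)
    return [
        {"PutRequest": {"Item": {k: typed_value(v) for k, v in item.items()}}}
        for item in items
    ]

# With boto3 these would be sent in chunks of at most 25 requests:
# dynamodb.batch_write_item(RequestItems={"my-table": chunk})
```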

- The EC2 instance I mentioned will already be running constantly, since it does other jobs that must run continually. I was just afraid that running a pipeline job once a week, while keeping the pipeline created but unused, would incur a large cost. I suppose I could schedule the pipeline to be created each week, do the job, and then be deleted. – D. King Sep 20 '15 at 21:39
- It incurs $1 per month just for the pipeline (https://aws.amazon.com/datapipeline/pricing/), plus the cost of running an EC2 instance during task execution. If you set the task frequency to weekly, Data Pipeline provisions a resource at that time, runs the task, and terminates the resource when it completes. – user1452132 Sep 21 '15 at 17:14
I just checked the pipeline pricing page, and it says: "For example, a pipeline that runs a daily job (a Low Frequency activity) on AWS to replicate an Amazon DynamoDB table to Amazon S3 would cost $0.60 per month". So I think I'm safe.
