
My DynamoDB table has around 100 million items (about 30 GB), and I provisioned it with 10k RCUs. I'm using a Data Pipeline job to export the data.

The Data Pipeline Read Throughput Ratio is set to 0.9.

How do I calculate the time for the export to complete? (The pipeline is taking more than 4 hours to finish the export.)

How can I optimize this so that the export completes in less time?

How does the Read Throughput Ratio relate to DynamoDB export?

  • If you have point in time recovery activated, there is a much easier solution available now, see the [news blog](https://aws.amazon.com/blogs/aws/new-export-amazon-dynamodb-table-data-to-data-lake-amazon-s3/) – Maurice Mar 09 '21 at 18:14

2 Answers


The answer to this question addresses most of your questions about estimating how long the Data Pipeline export will take.
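As for how the Read Throughput Ratio relates to the export: it tells Data Pipeline what fraction of the table's provisioned read capacity the job may consume, so at 0.9 it will try to use up to 9,000 of your 10,000 RCUs. Here is a rough back-of-the-envelope sketch of the theoretical minimum export time, assuming the export scans the table with eventually consistent reads (1 RCU covers about 8 KB per second in that mode):

```python
# Rough estimate of the theoretical minimum time for a scan-based export.
provisioned_rcus = 10_000    # the table's provisioned read capacity
throughput_ratio = 0.9       # Data Pipeline "Read Throughput Ratio"
table_size_gb = 30

# Eventually consistent reads: 1 RCU ~= two 4 KB reads/s = 8 KB/s.
effective_rcus = provisioned_rcus * throughput_ratio      # 9,000 RCUs
read_rate_kb_per_s = effective_rcus * 8                   # ~72,000 KB/s

table_size_kb = table_size_gb * 1024 * 1024
estimated_seconds = table_size_kb / read_rate_kb_per_s    # ~437 s
print(f"Theoretical minimum: {estimated_seconds / 60:.1f} minutes")  # ~7 min
```

In practice the job takes far longer than this theoretical minimum: Data Pipeline runs the export on an EMR cluster, and cluster startup, task scheduling, and uneven key distribution mean it rarely sustains the full allotted throughput.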

There is now a much better way to export data from DynamoDB to S3, announced in November 2020: you can export directly from DynamoDB, without provisioning an EMR cluster and tons of RCUs.

Check out the documentation for: Exporting DynamoDB table data to Amazon S3
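For reference, here is a minimal sketch of that native export using boto3; the table ARN and bucket name are placeholders, and point-in-time recovery must be enabled on the table:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Requires point-in-time recovery (PITR) to be enabled on the table.
response = dynamodb.export_table_to_point_in_time(
    TableArn="arn:aws:dynamodb:us-east-1:123456789012:table/MyTable",  # placeholder
    S3Bucket="my-export-bucket",                                       # placeholder
    ExportFormat="DYNAMODB_JSON",
)

# The export runs asynchronously; poll the status until it is COMPLETED.
print(response["ExportDescription"]["ExportStatus"])  # e.g. IN_PROGRESS
```

Because this export reads from the PITR backup rather than the live table, it consumes no RCUs at all.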

  • Thank you Maurice. When we use the "Export to S3" option on DynamoDB, there is no corresponding "Import from S3" option, hence we chose Data Pipeline for the DynamoDB data migration. – Harika K May 05 '22 at 13:00
  • That's a requirement you may want to mention in your question – Maurice May 05 '22 at 13:53

You can use the package known as dynoport (https://www.npmjs.com/package/dynoport), which will help you seamlessly import and export data from DynamoDB.