
My DynamoDB table has around 100 million items (about 30 GB), and I provisioned it with 10k RCUs. I'm using a Data Pipeline job to export the data.

The Data Pipeline Read Throughput Ratio is set to 0.9.

How do I calculate the time for the export to complete? (The pipeline is taking more than 4 hours to finish the export.)

How can I optimize this so that the export completes in less time?

How does the Read Throughput Ratio relate to DynamoDB export?

  • If you have point in time recovery activated, there is a much easier solution available now, see the [news blog](https://aws.amazon.com/blogs/aws/new-export-amazon-dynamodb-table-data-to-data-lake-amazon-s3/) – Maurice Mar 09 '21 at 18:14

2 Answers


The answer to this question addresses most of your questions about estimating how long the Data Pipeline export will take.
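As for how the Read Throughput Ratio relates to the export: it tells Data Pipeline what fraction of the table's provisioned read capacity the job may consume, so at 0.9 it will try to use up to 9,000 of your 10,000 RCUs. Here is a rough back-of-the-envelope sketch of the theoretical minimum export time, assuming the export scans the table with eventually consistent reads (1 RCU covers about 8 KB per second in that mode):

```python
# Rough estimate of the theoretical minimum time for a scan-based export.
provisioned_rcus = 10_000    # the table's provisioned read capacity
throughput_ratio = 0.9       # Data Pipeline "Read Throughput Ratio"
table_size_gb = 30

# Eventually consistent reads: 1 RCU ~= two 4 KB reads/s = 8 KB/s.
effective_rcus = provisioned_rcus * throughput_ratio      # 9,000 RCUs
read_rate_kb_per_s = effective_rcus * 8                   # ~72,000 KB/s

table_size_kb = table_size_gb * 1024 * 1024
estimated_seconds = table_size_kb / read_rate_kb_per_s    # ~437 s
print(f"Theoretical minimum: {estimated_seconds / 60:.1f} minutes")  # ~7 min
```

In practice the job takes far longer than this theoretical minimum: Data Pipeline runs the export on an EMR cluster, and cluster startup, task scheduling, and uneven key distribution mean it rarely sustains the full allotted throughput.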

There is now a much better way to export data from DynamoDB to S3, announced in November 2020: you can export directly from DynamoDB, without provisioning an EMR cluster and tons of RCUs.

Check out the documentation for: Exporting DynamoDB table data to Amazon S3
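For reference, here is a minimal sketch of that native export using boto3; the table ARN and bucket name are placeholders, and point-in-time recovery must be enabled on the table:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Requires point-in-time recovery (PITR) to be enabled on the table.
response = dynamodb.export_table_to_point_in_time(
    TableArn="arn:aws:dynamodb:us-east-1:123456789012:table/MyTable",  # placeholder
    S3Bucket="my-export-bucket",                                       # placeholder
    ExportFormat="DYNAMODB_JSON",
)

# The export runs asynchronously; poll the status until it is COMPLETED.
print(response["ExportDescription"]["ExportStatus"])  # e.g. IN_PROGRESS
```

Because this export reads from the PITR backup rather than the live table, it consumes no RCUs at all.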

  • Thank you Maurice. When we use the "Export to S3" option on DynamoDB, there is no corresponding "Import from S3" option, hence we chose Data Pipeline for the DynamoDB data migration. – Harika K May 05 '22 at 13:00
  • That's a requirement you may want to mention in your question – Maurice May 05 '22 at 13:53

You can use the package known as dynoport (https://www.npmjs.com/package/dynoport), which will help you seamlessly import and export data from DynamoDB.