1

We are currently using Amazon RDS for our database and we are planning to move to DynamoDB to scale our messaging DB. But a lot of analytics has been setup on MySQL tables which we want to keep using.

What is the best way to sync data from DynamoDB to Redshift ?

chaithu
  • 509
  • 2
  • 7
  • 29
  • you cannot sync, but you can use redshift COPY command. – Jon Scott May 15 '19 at 08:29
  • @JonScott Can we periodically copy data to redshift COPY command, including create,update and delete changes ? – chaithu May 15 '19 at 08:47
  • 1
    I am confused. You say you wish to move from Redshift to DynamoDB, then you mention MySQL tables, then you ask how to sync from DynamoDB to Redshift. Could you please clarify what transfer you are actually asking about? – John Rotenstein May 15 '19 at 08:58
  • 1
    @JohnRotenstein read again - OP says "We are currently using Amazon RDS for our database and we are planning to move to DynamoDB" I assume RDS mysql is current. – Jon Scott May 15 '19 at 08:59
  • 1
    Oops! My bad, thank you. So I think that means the OP is wanting to move from RDS to DynamoDB and they have an existing method of transferring data from RDS to Redshift, and they want to change that to transferring from DynamoDB to Redshift. Is that a correct reading? – John Rotenstein May 15 '19 at 09:19
  • @JohnRotenstein hi john, We are using RDS which is currently syncing to Redshift via Data pipeline. I was wondering is there something for DynamoDB also to sync to redshift periodically ? – chaithu May 15 '19 at 09:19

1 Answers1

1

For this kind of problems (synchronization of source database into Redshift) you have basically two common solutions:

  • (A) create application that periodically polls input database, detects delta and ingest detected delta data into Redshift
  • (B) use AWS Kinesis Firehose

Option B is much easier to implement, as you basically use dedicated AWS service for this purpose. However it is suitable only for source databases supported by Firehose. Luckily for you, DynamoDB is supported as input source for Firehose. Also you need to calculate an extra cost associated with Kinesis itself, Lambda used for record conversions, etc. to be sure that this option is valid for you from economical point of view. See this article to see how you can connect DynamoDB streams with Kinesis using Firehose.

Option A is usually a valid solution for source databases not supported by Firehose.

Henryk Konsek
  • 9,016
  • 5
  • 32
  • 41
  • It's not quite as simple as connecting a DynamoDB instance to Kinesis Firehose. You first need to enable and configure DynamoDB Streams, then create a Lambda function to read off the DynamoDB stream and push it to the Firehose, which will then write the data to Redshift. – Nathan Griffiths May 17 '19 at 01:38
  • 1
    If you want to move data *directly* from DynamoDB to Redshift, you can use the COPY statement (https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-source-dynamodb.html) - however this approach is only able to copy an entire DynamoDB table, not incremental updates. – Nathan Griffiths May 17 '19 at 01:39