I am new to AWS Data Pipeline. I need to back up a DynamoDB table to an S3 bucket, restore from that backup into a separate restored DynamoDB table, and then validate the records, meaning check that the number of records in the S3 backup matches the number in the restored DynamoDB table.

Can somebody please let me know how to do this? I know that Data Pipeline already has templates for copying records from DynamoDB to S3 and from S3 to DynamoDB.

But I would like to hear from someone experienced how to do the backup, restore, and validation all in the same pipeline, mainly the restore and validate steps.

Any help would be valuable.

Varun

1 Answer

For backup and restore, you can basically just combine the two templates, one after the other, to get your desired pipeline (if you're not sure how to do that, you can set another activity to execute in the data pipeline after the current one finishes).
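To make "one after the other" concrete, here is a minimal sketch of chaining two activities with the `dependsOn` field of a pipeline definition. The ids `BackupActivity` and `RestoreActivity` and the helper `chain_after` are hypothetical names for illustration; the real templates use their own ids and many more fields, which are left out here.

```python
import copy
import json


def chain_after(objects, activity_id, predecessor_id):
    """Return a copy of `objects` in which `activity_id` depends on `predecessor_id`."""
    result = copy.deepcopy(objects)
    ids = {obj["id"] for obj in result}
    for wanted in (activity_id, predecessor_id):
        if wanted not in ids:
            raise KeyError(f"no pipeline object with id {wanted!r}")
    for obj in result:
        if obj["id"] == activity_id:
            # Pipeline definitions reference other objects as {"ref": "<id>"}.
            obj["dependsOn"] = {"ref": predecessor_id}
    return result


# Hypothetical, heavily trimmed stand-ins for the two template activities.
pipeline_objects = [
    {"id": "BackupActivity", "type": "EmrActivity"},
    {"id": "RestoreActivity", "type": "EmrActivity"},
]

chained = chain_after(pipeline_objects, "RestoreActivity", "BackupActivity")
print(json.dumps({"objects": chained}, indent=2))
```

With that dependency in place, the restore should only start once the backup activity has finished successfully.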

As for validation, you could in theory do that as part of a shell command activity, but I'd recommend against it. Shell command activities are very complicated to set up and debug, so you're much better off running some other process to take care of validation.

(I'm serious about shell command activities being difficult to work with. You get very little insight into how they are running, what happens during their run, and the status of the run.)
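As a rough idea of what such a separate validation process could look like, here is a sketch in Python. It assumes the backup files contain one exported item per line (check this against your actual backup output) and that boto3 is installed with credentials configured. The function names are mine, not part of Data Pipeline.

```python
from typing import Iterable


def count_backup_records(lines: Iterable[str]) -> int:
    """Count non-empty lines, assuming one exported item per line."""
    return sum(1 for line in lines if line.strip())


def count_table_items(table_name: str) -> int:
    """Count items in a DynamoDB table with a paginated COUNT scan."""
    import boto3  # imported lazily so the pure helpers work without it

    client = boto3.client("dynamodb")
    paginator = client.get_paginator("scan")
    total = 0
    # Select="COUNT" returns only per-page counts, not the items themselves.
    for page in paginator.paginate(TableName=table_name, Select="COUNT"):
        total += page["Count"]
    return total


def counts_match(backup_count: int, table_count: int) -> bool:
    """Validation passes when both sides hold the same number of records."""
    return backup_count == table_count
```

You would feed `count_backup_records` the lines of every backup file (downloaded or streamed from S3), sum the results, and compare the total against `count_table_items` for the restored table. Note that a full scan consumes read capacity, so this is only reasonable for tables of modest size.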

Gordon Seidoh Worley
  • Thanks Gordon! I was thinking along the same lines: one activity after another, just as you recommended. FYI, I started using the existing templates for the backup from DynamoDB to S3 and for the restore from S3 to DynamoDB, and I am thinking of using a HiveActivity (although I am new to that too) for validation. – Varun Jan 16 '14 at 16:31
  • Hey @G Gordon Worley III, by the way, I now have to copy files from one S3 bucket to another. I used the existing copy template for that, but it gives me a Java heap error. So now I am thinking of using an EmrActivity with S3DistCp (http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_s3distcp.html), but I am getting a formatting error. This is what I put in the steps of the EmrActivity: " /home/hadoop/lib/emr-s3distcp-1.0.jar,--args '--src,s3://,--dest,s3://' ". Have you by any chance worked with this? – Varun Jan 16 '14 at 16:46