
I'm trying to use Data Pipeline to export data from DynamoDB to S3. However, I can't figure out how to apply client-side encryption before the file is written to S3. Is there a way to do this with Data Pipeline? I can set up everything except the client-side encryption. The ideal flow is a DynamoDB source node, an activity that encrypts, and an S3 destination node.

I also tried Elastic MapReduce, but I don't see how to write a mapper and a reducer when I'm not transforming any data; I just need to move it into an encrypted file on S3. I should be able to use EMR with a Hive program, but I am struggling to understand how to use EMR without writing custom map/reduce code. Ideally, no code is stored in S3.

Server-side encryption isn't an option: the data must be encrypted before it is written to S3.

I am looking for some ideas on how to do this or someone who had a similar challenge.

2 Answers


Data Pipeline doesn't currently support hooks for custom pre- or post-processing.

How large is your table? How long is acceptable for the export process to complete?

It should be possible to do this with DynamoDB parallel scan: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#QueryAndScanParallelScan

Essentially, you would write a program that uses multiple threads, one per scan segment, to run the parallel scan, encrypt the items, and store the encrypted output in S3. Each DynamoDB scan page returns up to 1 MB of data, so you could aggregate several pages before publishing each object to S3.
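A minimal sketch of that approach in Python, under several assumptions that are not part of the answer above: `dynamodb` and `s3` are boto3 low-level clients, `encrypt` is whatever bytes-to-bytes client-side encryption function you supply, and the function names and S3 key layout are made up for illustration. It also assumes the table has no Binary attributes, since those would need base64 encoding before `json.dumps`.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List


def scan_segment(dynamodb, table: str, segment: int,
                 total_segments: int) -> Iterator[List[dict]]:
    """Yield the items of each scan page for one parallel-scan segment."""
    kwargs = {"TableName": table, "Segment": segment,
              "TotalSegments": total_segments}
    while True:
        resp = dynamodb.scan(**kwargs)
        yield resp.get("Items", [])
        if "LastEvaluatedKey" not in resp:
            return  # this segment is exhausted
        kwargs["ExclusiveStartKey"] = resp["LastEvaluatedKey"]


def batch_pages(pages: Iterable[List[dict]],
                pages_per_object: int) -> Iterator[List[dict]]:
    """Merge consecutive pages so each S3 object holds several pages."""
    batch, count = [], 0
    for items in pages:
        batch.extend(items)
        count += 1
        if count >= pages_per_object:
            if batch:
                yield batch
            batch, count = [], 0
    if batch:
        yield batch


def export_segment(dynamodb, s3, table: str, bucket: str, segment: int,
                   total_segments: int, encrypt: Callable[[bytes], bytes],
                   pages_per_object: int) -> int:
    """Scan one segment, encrypt each batch locally, upload it to S3."""
    written = 0
    pages = scan_segment(dynamodb, table, segment, total_segments)
    for part, items in enumerate(batch_pages(pages, pages_per_object), start=1):
        # Encryption happens here, before any bytes leave the process.
        body = encrypt(json.dumps(items).encode("utf-8"))
        # Hypothetical key layout: <table>/segment-NNNN/part-NNNNNN.enc
        key = f"{table}/segment-{segment:04d}/part-{part:06d}.enc"
        s3.put_object(Bucket=bucket, Key=key, Body=body)
        written = part
    return written


def export_table(dynamodb, s3, table: str, bucket: str,
                 encrypt: Callable[[bytes], bytes],
                 total_segments: int = 4, pages_per_object: int = 8) -> int:
    """Run one worker thread per segment; return the number of S3 objects."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(export_segment, dynamodb, s3, table, bucket, segment,
                        total_segments, encrypt, pages_per_object)
            for segment in range(total_segments)
        ]
        return sum(f.result() for f in futures)
```

With real AWS clients this might be called as `export_table(boto3.client("dynamodb"), boto3.client("s3"), "my-table", "my-bucket", encrypt)`, where `encrypt` could be, for example, `Fernet(key).encrypt` from the `cryptography` package. The table name, bucket name, and choice of cipher are placeholders.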

To restore the data, you would load the S3 files, decrypt, and then write back to DynamoDB.
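The restore path could look roughly like the following sketch. It assumes each S3 object is an encrypted JSON array of items in DynamoDB's attribute-value format, that `decrypt` is the inverse of whatever encryption function was used for the export, and that `dynamodb` and `s3` are boto3 low-level clients; the function names are hypothetical.

```python
import json
from typing import Callable, Iterator, List


def chunks(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Split a list into consecutive slices of at most `size` elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def restore_object(dynamodb, s3, table: str, bucket: str, key: str,
                   decrypt: Callable[[bytes], bytes]) -> int:
    """Download one encrypted export file, decrypt it, and write the items back."""
    blob = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    items = json.loads(decrypt(blob).decode("utf-8"))
    # BatchWriteItem accepts at most 25 put/delete requests per call.
    for chunk in chunks(items, 25):
        pending = {table: [{"PutRequest": {"Item": item}} for item in chunk]}
        while pending:
            resp = dynamodb.batch_write_item(RequestItems=pending)
            # Throttled writes come back as UnprocessedItems; resend them.
            pending = resp.get("UnprocessedItems") or {}
    return len(items)
```

A production version would probably add exponential backoff between retries of unprocessed items rather than resending immediately.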

Ben Schwartz

If this is acceptable for your use case, you can do the client-side encryption before writing your data to DynamoDB. You could then use Data Pipeline to export the already-encrypted data to S3.

I have a similar setup for my application using a client-side encryption library provided by aws-labs. We export the tables daily to keep backups. Restoring the data works as long as the encryption metadata is exported with it.

jrochette