My service stores a lot of files on AWS S3. To comply with GDPR, I have to make it possible to create an archive of these files and store it on S3 as well (for download). I also have to filter these files.

The best option, in my view, is to run Lambda functions (as steps of a Step Functions state machine) that filter the files, extract their content, build the archive, and upload it back to S3.
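
A minimal sketch of the filter/archive/upload step described above, assuming a ZIP archive is acceptable and that the finished archive fits in the function's temporary storage. The names `build_archive`, `archive_prefix`, `keep`, `bucket`, `prefix`, and `dest_key` are hypothetical; only the `zipfile` and boto3 calls are real APIs. Each object is streamed in chunks, so memory use stays roughly bounded by the chunk size rather than the file size.

```python
import io
import tempfile
import zipfile
from typing import BinaryIO, Callable, Iterable, Iterator, Tuple


def build_archive(entries: Iterable[Tuple[str, Iterator[bytes]]],
                  out: BinaryIO,
                  keep: Callable[[str], bool] = lambda key: True) -> int:
    """Write every entry whose key passes `keep` into a ZIP on `out`.

    Returns the number of entries written. Each entry is (key, chunk iterator),
    so a file is never held in memory as a whole.
    """
    written = 0
    with zipfile.ZipFile(out, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for key, chunks in entries:
            if not keep(key):
                continue  # filtered out
            # The member size is unknown up front, so allow ZIP64 records.
            with zf.open(key, mode="w", force_zip64=True) as member:
                for chunk in chunks:
                    member.write(chunk)
            written += 1
    return written


def archive_prefix(bucket: str, prefix: str, dest_key: str,
                   keep: Callable[[str], bool] = lambda key: True) -> int:
    """Hypothetical Lambda-side glue: archive the objects under `prefix`."""
    import boto3  # assumed to be available in the Lambda runtime

    s3 = boto3.client("s3")

    def entries():
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get("Contents", []):
                body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"]
                yield obj["Key"], body.iter_chunks(chunk_size=1024 * 1024)

    # Assumption: the archive fits in the function's temporary storage.
    with tempfile.TemporaryFile() as tmp:
        count = build_archive(entries(), tmp, keep)
        tmp.seek(0)
        s3.upload_fileobj(tmp, bucket, dest_key)
    return count
```

If the file set is too large for a single invocation, the same `build_archive` function could be run per batch of keys, with the state machine fanning out over batches; that split is a design choice this sketch does not cover.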

I looked for something like Amazon Elastic Transcoder, which transcodes media on the fly, hoping to find an equivalent service for archiving files.

In the worst case, I will run an EC2 instance and host a custom service there that creates and uploads the archive for me. That isn't ideal, though: this functionality will only be called from time to time, so FaaS is a better fit.

Is there a more elegant solution?

RredCat
  • What does 'archive' mean in this context? – jarmod Mar 15 '18 at 21:10
  • @jarmod I want to create an archive file and send the user a link to it. – RredCat Mar 15 '18 at 22:06
  • OK, so it sounds like an archive is simply a copy of a file, stored in another S3 location. AWS SDKs support a 'copy object' operation that you can use to copy an object. There's no need for you to download the object and re-upload it. You should think carefully about how you intend to provide a link to this archived object to the user. Unless they can authenticate as an AWS user, you need a more sophisticated solution because the standard candidates (publicly-accessible object or pre-signed URL) won't work for you. – jarmod Mar 15 '18 at 23:20
  • @jarmod, I have already implemented a sophisticated solution for the S3 links: I create URLs that are time-limited and restricted to the user's IP address. My issue is how to create an archive file in a Lambda function without exceeding the time and memory limits, or which service to use for this instead. – RredCat Mar 16 '18 at 07:01
  • Have you tested boto3's s3.copy_object()? Some related solutions here: https://aws.amazon.com/blogs/compute/content-replication-using-aws-lambda-and-amazon-s3/ and https://aws.amazon.com/blogs/compute/synchronizing-amazon-s3-buckets-using-aws-step-functions/ – jarmod Mar 16 '18 at 14:39
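
The server-side copy suggested in the comments can be sketched as below. This only covers the "archive as a plain copy" reading, not building a single archive file. The helper name `copy_to_archive` and the `archive/` prefix are hypothetical; `copy_object` with `Bucket`, `Key`, and `CopySource` is the real boto3 client call. The client is passed in so the helper stays easy to test.

```python
def copy_to_archive(s3, bucket: str, key: str, archive_prefix: str = "archive/") -> str:
    """Copy one object to an archive prefix inside the same bucket.

    `s3` is expected to be a boto3 S3 client (boto3.client("s3")).
    The copy happens inside S3; nothing is downloaded or re-uploaded.
    """
    dest_key = archive_prefix + key  # hypothetical naming scheme
    s3.copy_object(
        Bucket=bucket,
        Key=dest_key,
        CopySource={"Bucket": bucket, "Key": key},
    )
    return dest_key
```

Usage would look like `copy_to_archive(boto3.client("s3"), "my-bucket", "docs/a.txt")`, where the bucket and key are placeholders.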

0 Answers