I have configured my on-premises database server to upload a backup to AWS S3 every day at 11 pm. The backup is encrypted using S3 standard encryption.

I need to restore this encrypted PostgreSQL backup from S3 to RDS (running Postgres) as soon as it reaches S3, then query some fields and pass the data to a Python function that will send it to a list of email addresses.

My question is: can I use AWS Lambda for this task instead of an EC2 instance?

marc_s
Joseph

1 Answer

In theory, yes you can, but there are some challenges ahead if you go down this path.

To restore the backup from S3 to PostgreSQL, you will have to bundle the pg_restore or psql binary into your Lambda deployment package. Your Lambda code then has to either download the backup file from S3 to the Lambda's /tmp folder (watch out for the /tmp size limit: 512 MB by default, configurable up to 10 GB) or stream the S3 object's content to pg_restore/psql via stdin. To invoke psql or pg_restore, use Python's subprocess module. With large backups, be careful not to load the whole backup file into memory, or you could exceed Lambda's RAM limit.
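
A rough sketch of the streaming option, not a tested deployment. It assumes the backup is in pg_restore's custom format (for a plain SQL dump you would swap in psql), that a pg_restore binary is on the PATH inside the Lambda package, that the function is triggered by an S3 "object created" event, and that the object uses S3-managed server-side encryption, which S3 decrypts transparently on read. TARGET_DSN is a hypothetical environment variable holding the RDS connection string.

```python
import os
import subprocess
import urllib.parse


def stream_to_process(chunks, command, env=None):
    """Feed an iterable of byte chunks to a subprocess via stdin.

    Only one chunk is held in memory at a time. Returns the exit code.
    """
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, env=env)
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
        proc.stdin.close()  # signals end of input to the child process
    except BrokenPipeError:
        # The child exited early; its exit code reports the failure.
        pass
    return proc.wait()


def handler(event, context):
    # Imported here so stream_to_process can be exercised without AWS access.
    import boto3

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = urllib.parse.unquote_plus(record["object"]["key"])

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]

    # TARGET_DSN is an assumed name, e.g. "postgresql://user:pw@host/db".
    command = ["pg_restore", "--no-owner", "--dbname", os.environ["TARGET_DSN"]]
    exit_code = stream_to_process(body.iter_chunks(chunk_size=1024 * 1024), command)
    if exit_code != 0:
        raise RuntimeError(f"pg_restore exited with code {exit_code}")
```

Streaming in 1 MB chunks keeps memory use flat regardless of backup size, but the restore still has to finish within Lambda's execution time limit.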

To query the database, you'll want to include psycopg2 in your Lambda's deployment package. See https://github.com/jkehler/awslambda-psycopg2 for details on how to do that.
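
A minimal sketch of the query step, assuming psycopg2 is importable in the Lambda. The table and column names (customers, id, email) and the TARGET_DSN environment variable are hypothetical placeholders. The helper only relies on the standard Python DB-API, so it takes an already open connection.

```python
import os


def fetch_rows(conn, query, params=None):
    """Run a query on a DB-API connection and return rows as dictionaries."""
    cur = conn.cursor()
    try:
        cur.execute(query, params or ())
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


def fetch_report():
    # Imported here so fetch_rows can be tried against any DB-API driver.
    import psycopg2

    # TARGET_DSN is an assumed name for the RDS connection string.
    conn = psycopg2.connect(os.environ["TARGET_DSN"])
    try:
        # Hypothetical table and columns; replace with the fields you need.
        return fetch_rows(conn, "SELECT id, email FROM customers")
    finally:
        conn.close()
```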

To send the data via email, you should look into using AWS SES.
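
For illustration, sending the query results through SES with boto3 could look roughly like this. The addresses, subject and row layout are made up, and the sender address is assumed to be a verified SES identity.

```python
def build_email(sender, recipients, subject, rows):
    """Build the keyword arguments for the SES send_email call."""
    lines = [", ".join(f"{key}={value}" for key, value in row.items()) for row in rows]
    return {
        "Source": sender,
        "Destination": {"ToAddresses": list(recipients)},
        "Message": {
            "Subject": {"Data": subject},
            "Body": {"Text": {"Data": "\n".join(lines)}},
        },
    }


def send_report(ses_client, sender, recipients, subject, rows):
    """Send the rows as a plain-text email through the given SES client."""
    return ses_client.send_email(**build_email(sender, recipients, subject, rows))
```

In the Lambda you would call it with something like send_report(boto3.client("ses"), "reports@example.com", ["a@example.com"], "Daily report", rows), where the addresses are placeholders.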

Personally, I would probably use a Docker container for this (on AWS ECS or Batch). That way, it will probably be easier to install the necessary binaries and libraries (pg_restore/psql, psycopg2). You will also avoid Lambda's inherent limitations (15-minute execution time limit, maximum /tmp size, RAM limit).

spg