
The Data Pipeline job runs on a schedule and calls a shell script that ultimately invokes pg_dump. I'd like to keep generating the pg_dump backups since they are useful, but move away from Data Pipeline, since I notice AWS is soon removing console access to it. I was thinking of doing the work in Glue or Lambda instead. Any ideas for a good approach to take?
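For context, the scheduled script boils down to something like this (a minimal sketch in Python rather than shell; the host, database, user, and bucket names are placeholders, and the S3 destination is my assumption about where the backups land, not something stated above):

    import subprocess
    import boto3

    # Placeholder connection details and bucket name -- not the real values.
    HOST, USER, DB = "mydb.example.com", "backup_user", "mydb"
    BUCKET, KEY = "my-backup-bucket", "backups/mydb.sql"
    DUMP_FILE = "/tmp/mydb.sql"

    # Dump the database to a local file (password supplied via PGPASSWORD or ~/.pgpass).
    subprocess.check_call(["pg_dump", "-h", HOST, "-U", USER, "-d", DB, "-f", DUMP_FILE])

    # Upload the dump to S3.
    boto3.client("s3").upload_file(DUMP_FILE, BUCKET, KEY)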

I've so far tried to write a Glue job (a PySpark script) that builds the pg_dump command line as a string and runs it with subprocess:

    import subprocess
    subprocess.check_call(cmdString, shell=True)

where cmdString holds the full pg_dump command (since it is a single string rather than an argument list, it needs shell=True).

But of course that does not work, since the job has no way to locate the pg_dump program: it is not installed on the Glue workers.
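For what it's worth, this is easy to confirm from inside the job (a minimal sketch; checking PATH with shutil.which is just one way to see that the binary is absent):

    import shutil

    # On a Glue worker pg_dump is not on PATH, so this prints None;
    # the subprocess call above fails for the same reason.
    print(shutil.which("pg_dump"))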
