
I'm trying to load data from a Postgres table to S3 using Airflow. The PostgresToS3 operator works for all tables except one, which is very large. The task runs for some time (~800 s) and then stops without writing any logs. This seems to be because the connection is getting closed after some timeout period. I tried running `cursor.execute("SET statement_timeout = 0")` before `cursor.execute(self.sql)` (see the sketch below for where that sits in the operator). Is there any way I can fix this?

https://airflow.apache.org/docs/stable/_modules/airflow/hooks/postgres_hook.html
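
For context, here is a minimal sketch of where that `SET statement_timeout = 0` call sits relative to the main query. The operator class name, connection IDs, S3 bucket/key, and the naive row dump are simplified placeholders, not my exact code:

```python
from airflow.hooks.postgres_hook import PostgresHook
from airflow.hooks.S3_hook import S3Hook
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class PostgresToS3Operator(BaseOperator):
    """Copies the result of a Postgres query to a key in S3 (simplified)."""

    @apply_defaults
    def __init__(self, sql, s3_bucket, s3_key,
                 postgres_conn_id="postgres_default",
                 aws_conn_id="aws_default", *args, **kwargs):
        super(PostgresToS3Operator, self).__init__(*args, **kwargs)
        self.sql = sql
        self.s3_bucket = s3_bucket
        self.s3_key = s3_key
        self.postgres_conn_id = postgres_conn_id
        self.aws_conn_id = aws_conn_id

    def execute(self, context):
        pg_hook = PostgresHook(postgres_conn_id=self.postgres_conn_id)
        conn = pg_hook.get_conn()
        cursor = conn.cursor()

        # The change I tried: disable Postgres' statement timeout for this
        # session before running the long extraction query.
        cursor.execute("SET statement_timeout = 0")
        cursor.execute(self.sql)
        rows = cursor.fetchall()
        cursor.close()
        conn.close()

        # Naive CSV-style dump of all rows, then a single upload to S3.
        data = "\n".join(",".join(str(col) for col in row) for row in rows)
        s3_hook = S3Hook(aws_conn_id=self.aws_conn_id)
        s3_hook.load_string(data, key=self.s3_key,
                            bucket_name=self.s3_bucket, replace=True)
```

As far as I understand, `SET statement_timeout = 0` only disables the server-side statement timeout for that session; it doesn't change TCP keepalives or any idle timeouts between the worker and the database.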

Minato
  • Please, take a look [here](https://stackoverflow.com/questions/59980922/how-to-export-large-data-from-postgres-to-s3-using-cloud-composer). Have you looked for logs? – aga Feb 26 '20 at 08:41
  • No use. I checked the StackDriver logs as well, but there was nothing useful there either. – Minato Feb 26 '20 at 08:43
  • Have you installed all libraries needed to run postgres_hook? – aga Mar 03 '20 at 09:25
  • @muscat Yes, I have several other pipelines using postgres_hook. – Minato Mar 03 '20 at 17:52
  • There are probably some limitations. How large is your dataset? Have you checked how your Kubernetes workers and the environment behave while transferring the data? – aga Mar 05 '20 at 15:47

0 Answers