I have set up Airflow 1.10.10 with Celery as the executor, Postgres as the result backend and SQLAlchemy connection, and Redis as the broker/message queue. Each Airflow component (scheduler, webserver, broker, and one worker) runs in its own pod with 2 GiB of memory and 2 CPU cores. My Postgres instance is running in Azure with 2 CPU cores.

The main issue is that whenever I start scheduling some of the example DAGs, the Postgres CPU usage hits ~95% and tasks start to fail because of connection issues (such as PID timeouts in the scheduler, or the error "FATAL: remaining connection slots are reserved for non-replication superuser connections").

I've tried changing some of the SQLAlchemy pool parameters in airflow.cfg, but I'm still getting the issue.

My questions: is a Postgres DB running in Azure with 2 CPU cores good enough for handling DAGs? What would be an appropriate setup? Or how can I prevent Airflow from congesting Postgres? Thanks!
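For reference, these are the SQLAlchemy pool settings I've been experimenting with in the `[core]` section of airflow.cfg (the values shown are just examples of what I tried, not known-good settings):

```ini
[core]
# Keep SQLAlchemy connection pooling on and cap the pool size per process
sql_alchemy_pool_enabled = True
sql_alchemy_pool_size = 5
# Extra connections allowed beyond pool_size under load
sql_alchemy_max_overflow = 10
# Recycle connections after 30 minutes to avoid stale sessions
sql_alchemy_pool_recycle = 1800
# Test connections with a lightweight ping before using them
sql_alchemy_pool_pre_ping = True
```

Note that every Airflow process (scheduler, webserver, each worker) opens its own pool, so the effective connection count against Postgres is roughly (pool_size + max_overflow) multiplied by the number of processes.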