I have a DAG with one task that I want to trigger only once a day. The problem is that it gets triggered multiple times when the scheduled time comes, so the daily task runs 4 times instead of once. I tried a number of settings to fix this, including:
'retries': 1
catchup=False, max_active_runs=1
I also increased the delay between retries, thinking that Airflow might consider the task failed or not started, since the task can take some time to finish.
I also moved all the code that is supposed to run in that DAG to a utils folder, based on this answer.
But I don't know what I am missing here. Can anyone please help? Thank you in advance.
Here is the DAG:
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from utils.postgres import postgres_backup_to_s3
default_args = {
    'retries': 1,
    'retry_delay': timedelta(minutes=30),  # getting the backup and uploading it to S3 might take some time
    'start_date': datetime(2021, 1, 1)
}
with DAG('postgres_backup', default_args=default_args, schedule_interval='0 19 * * * *',
         catchup=False, max_active_runs=1) as dag:
    postgres_backup_to_s3_task = PythonOperator(task_id="postgres_backup_to_s3", python_callable=postgres_backup_to_s3)
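One thing that may be worth checking is the schedule string itself. A standard cron expression has five fields (minute, hour, day of month, month, day of week), while the `schedule_interval` above has six. This is only a minimal sanity-check sketch: the helper name `cron_field_count` is hypothetical, and the five-field string is my assumption about what "once a day at 19:00" should look like. It does not show how Airflow's parser treats the extra field.

```python
def cron_field_count(expression: str) -> int:
    """Return the number of whitespace-separated fields in a cron expression."""
    return len(expression.split())


# The schedule_interval used in the DAG above.
current_schedule = '0 19 * * * *'

# Assumed intent (not confirmed): minute 0, hour 19, every day.
five_field_schedule = '0 19 * * *'

for label, expr in [('current', current_schedule), ('five-field', five_field_schedule)]:
    # A standard cron expression is expected to have exactly 5 fields.
    print(f"{label}: {expr!r} has {cron_field_count(expr)} fields")
```

If the counts differ from five, the scheduler's cron parser may be reading the extra field in a way I did not intend, which could be related to the repeated triggers.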