I've got a DAG that's scheduled to run daily. In most scenarios, the scheduler would trigger this job as soon as the execution_date
is complete, i.e., the next day. However, due to upstream delays, I only want to kick off the DAG run for a given execution_date three days after that execution_date. In other words, I want to introduce a three-day lag.
From the research I've done, one route would be to add a TimeDeltaSensor at the beginning of my DAG run with delta=datetime.timedelta(days=3).
However, due to the way the Airflow scheduler is implemented, that's problematic. Under this approach, each of my DAG runs will be active for over three days. My DAG has lots of tasks, and when several DAG runs are active at once, I've noticed that the scheduler eats up lots of CPU because it's constantly iterating over all of these tasks (even inactive ones). So is there another way to just tell the scheduler not to kick off the DAG run until three days have passed?
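
To make the overlap concrete, here's a rough plain-Python sketch of the timing (no Airflow calls; the function and constant names are made up for illustration). It assumes the sensor counts its delta from the end of the daily schedule period, which is my reading of how TimeDeltaSensor behaves, so treat the exact offsets as an assumption.

```python
from datetime import datetime, timedelta

SCHEDULE_INTERVAL = timedelta(days=1)  # daily DAG, as described above
LAG = timedelta(days=3)                # the delta I would pass to the sensor


def period_end(execution_date):
    """Moment the scheduler would normally create and start the DAG run."""
    return execution_date + SCHEDULE_INTERVAL


def sensor_release(execution_date, lag=LAG):
    """Moment a sensor waiting `lag` past the period end would let the run proceed."""
    return period_end(execution_date) + lag


def runs_waiting(now, first_execution_date):
    """Execution dates whose runs have started but are still blocked on the sensor."""
    waiting = []
    d = first_execution_date
    while period_end(d) <= now:       # the run has already been created
        if sensor_release(d) > now:   # ...but the sensor has not succeeded yet
            waiting.append(d)
        d += SCHEDULE_INTERVAL
    return waiting


if __name__ == "__main__":
    blocked = runs_waiting(datetime(2024, 1, 10, 12), datetime(2024, 1, 1))
    for d in blocked:
        print(d.date(), "is still waiting on the sensor")
```

Under these assumptions, once the DAG has been running for a few days there are always three runs sitting on the sensor at any moment, and the scheduler keeps looping over all of their tasks.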