I have a managed Apache Airflow environment (Amazon MWAA) in which a number of DAGs are defined and enabled. Some DAGs are scheduled, running on a 15-minute schedule, while others are unscheduled. All of the DAGs are single-task DAGs. The DAGs are structured in the following way:

level 2 DAGs -> (triggers) level 1 DAG -> (triggers) level 0 DAG

The scheduled DAGs are the level 2 DAGs, while the level 1 and level 0 DAGs are unscheduled. The level 0 DAG uses ECSOperator to run a pre-defined Elastic Container Service (ECS) task, which runs a Python ETL script inside a Docker container defined in the ECS task definition. The level 2 DAGs wait on the level 1 DAG to complete, which in turn waits on the level 0 DAG to complete. The full Python logs produced by the ETL scripts are visible in the CloudWatch logs for the ECS task runs, while the Airflow task logs only show high-level logging.
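For reference, here is a minimal sketch of the shape of the level 0 DAG (the DAG id, cluster, task definition and log group names are placeholders, not my actual configuration):

```python
# Sketch of the unscheduled level 0 DAG; all identifiers are placeholders.
from airflow import DAG
from airflow.providers.amazon.aws.operators.ecs import ECSOperator
from airflow.utils.dates import days_ago

with DAG(
    dag_id="level_0_etl",
    schedule_interval=None,       # unscheduled; only runs when triggered by the level 1 DAG
    start_date=days_ago(1),
    catchup=False,
) as dag:
    run_etl = ECSOperator(
        task_id="run_etl_container",
        cluster="my-etl-cluster",                  # placeholder ECS cluster name
        task_definition="my-etl-task-definition",  # pre-defined ECS task (placeholder name)
        overrides={"containerOverrides": []},
        # The detailed Python logs end up in the container's CloudWatch log group;
        # the Airflow task log only shows the operator's high-level output.
        awslogs_group="/ecs/my-etl-task",           # placeholder
        awslogs_stream_prefix="ecs/etl-container",  # placeholder
    )
```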

The single tasks in the scheduled (level 2) DAGs have depends_on_past set to False, so I expected that successive scheduled runs of a level 2 DAG would not depend on each other, i.e. that if a particular run failed it would not prevent the next scheduled run from occurring. But Airflow appears to be overriding this: I can see clearly in the UI that a failure of a particular level 2 DAG run prevents the next run from being picked up by the scheduler. The next scheduled run's state is set to None, and I have to manually clear the failed DAG run's state before the scheduler can schedule it again.
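Here is a minimal sketch of the shape of a level 2 DAG as described above (DAG and task IDs are placeholders, and the `TriggerDagRunOperator` usage is only an illustration of how the level 1 DAG gets triggered):

```python
# Sketch of a scheduled level 2 DAG; identifiers are placeholders.
from airflow import DAG
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.utils.dates import days_ago

with DAG(
    dag_id="level_2_pipeline_a",         # placeholder DAG id
    schedule_interval="*/15 * * * *",    # runs every 15 minutes
    start_date=days_ago(1),
    catchup=False,
) as dag:
    trigger_level_1 = TriggerDagRunOperator(
        task_id="trigger_level_1",
        trigger_dag_id="level_1_dag",    # placeholder: the unscheduled level 1 DAG
        depends_on_past=False,           # expected to keep successive scheduled runs independent
    )
```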

Why does this happen? As far as I know, there is no Airflow configuration option that should override the task-level setting of False for depends_on_past in the level 2 DAG tasks. Any pointers would be greatly appreciated.

srm
  • When you say that Airflow is overriding this, do you mean that if you check Task Instance Details from the UI, you get `depends_on_past = True`? Are you setting `wait_for_downstream = True` in any of your tasks? That may explain the change to `depends_on_past`. – NicoE Oct 12 '21 at 15:26
  • Yes, the single tasks in the level 2 DAGs wait on the task inside the level 1 DAG, which waits on the task in the level 0 DAG. So the level 2 and level 1 DAG tasks have set `wait_for_downstream` to `True`. – srm Oct 13 '21 at 09:21
  • Yesterday I updated the Airflow configuration in the MWAA console to set `catchup_by_default` (or something similar) to `False`, deleted the existing DAGs, and re-uploaded them, and it seems to have solved the problem. But I'll have to keep an eye on it. – srm Oct 13 '21 at 09:24

1 Answer


To answer the question "why is this happening?": I understand that the behavior you are observing is explained by the fact that the tasks are being defined with wait_for_downstream = True. The docs state the following about it:

wait_for_downstream (bool) -- when set to true, an instance of task X will wait for tasks immediately downstream of the previous instance of task X to finish successfully or be skipped before it runs. This is useful if the different instances of a task X alter the same asset, and this asset is used by tasks downstream of task X. Note that depends_on_past is forced to True wherever wait_for_downstream is used. Also note that only tasks immediately downstream of the previous task instance are waited for; the statuses of any tasks further downstream are ignored.

Keep in mind that the term previous instance of task X refers to the task_instance of the last scheduled dag_run, not the upstream task (in a DAG with a daily schedule, that would be the task_instance from "yesterday").

This also explains why your tasks are executed once you clear the state of the previous DAG run.
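As a minimal illustration (the DAG and task names here are invented, not taken from your code), this is the kind of task definition where the override kicks in:

```python
# A task that explicitly sets depends_on_past=False but also sets wait_for_downstream=True.
from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.utils.dates import days_ago

with DAG(
    dag_id="example_level_2",
    schedule_interval="*/15 * * * *",
    start_date=days_ago(1),
    catchup=False,
) as dag:
    trigger_level_1 = DummyOperator(
        task_id="trigger_level_1",
        depends_on_past=False,     # explicitly set to False...
        wait_for_downstream=True,  # ...but this forces depends_on_past to True (see the docs above)
    )
    # Checking Task Instance Details in the UI for this task will show depends_on_past = True.
```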

I hope this helps clarify things!

NicoE
  • OK, that sounds like what is happening. But this is not intuitive for me: I thought `wait_for_downstream` means that if a task A in a DAG D triggers/calls task B in DAG E, then in a particular run of DAG D, A will wait until B completes in the DAG E run. I want the status of task A in DAG D to depend directly on whether task B in DAG E completes. – srm Oct 13 '21 at 14:50
  • I don't want any dependency between successive runs of any DAG. Every DAG run should be independent of previous runs, but because I have DAGs triggering other DAGs then those triggers should relay the status. – srm Oct 13 '21 at 14:51
  • Yes, I thought that would be your case. You can handle that "wait until completion of the triggered DAG" behaviour by using `TriggerDagRunOperator` with the `wait_for_completion` param. Check the parameters docs [here](https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/trigger_dagrun/index.html). – NicoE Oct 13 '21 at 15:15
  • You could also define a list of states to be considered as failed or succeeded in the triggered DAG run (see the sketch after these comments). Check these answers for further examples and references to the `TriggerDagRunOperator`: [answer1](https://stackoverflow.com/a/67526348/10569220), [answer2](https://stackoverflow.com/a/67526348/10569220) – NicoE Oct 13 '21 at 15:22
  • Thanks, I'll have a look. But I would note that `wait_for_downstream` is not a clear name, especially because `depends_on_past` indicates a dependency between runs of a given DAG. The case of DAG D triggering DAG E means D is upstream of E, and E is downstream of D. So `wait_for_downstream` suggests that if D triggers E then in any given run of D the tasks of D will wait for E to complete. – srm Oct 13 '21 at 16:21
  • I will set `wait_for_downstream` to `False` in the level 2 and level 1 DAGs, and use `wait_for_completion` instead, to see if it works. – srm Oct 13 '21 at 16:22
  • Cool, try it out; you will see that it's much cleaner using the TriggerDagRunOperator than doing it the other way around. Regarding the naming, I guess you are right, it wasn't clear to me at the beginning either. Small disclaimer here: I'm not an Airflow committer. Consider accepting this specific answer regarding the original question and, if you have further questions, go ahead and create new questions, maybe with code samples to make it easier for anyone to help you out. – NicoE Oct 13 '21 at 17:46
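
For reference, here is a rough sketch of the `TriggerDagRunOperator` setup suggested in the comments above (all DAG and task IDs are placeholders, and parameter availability depends on the Airflow version):

```python
# Level 2 DAG triggering the level 1 DAG and waiting for its completion,
# without wait_for_downstream, so depends_on_past stays False.
from airflow import DAG
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.utils.dates import days_ago

with DAG(
    dag_id="level_2_pipeline_a",          # placeholder
    schedule_interval="*/15 * * * *",
    start_date=days_ago(1),
    catchup=False,
) as dag:
    trigger_level_1 = TriggerDagRunOperator(
        task_id="trigger_level_1",
        trigger_dag_id="level_1_dag",     # placeholder
        wait_for_completion=True,         # block until the triggered DAG run finishes
        poke_interval=30,                 # seconds between checks of the triggered run's state
        allowed_states=["success"],       # triggered-run states treated as success
        failed_states=["failed"],         # triggered-run states that fail this task
    )
```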