
Hi, I want to set up an Airflow job schedule where I have 3 tasks, i.e. task_1 > task_2 > task_3.

If the first task, task_1, fails, I need to stop the remaining tasks from being executed.
How can this be handled?
How can I know whether an earlier job failed or succeeded? How do I get a task's status?
BdEngineer
  • The trigger rule all_success is the default for every task in a DAG. If task1 fails, the DAG run fails without running the downstream tasks. The status of the job is easily visible from the UI (see the sketch below these comments for reading it programmatically). – Priya Agarwal Jun 03 '20 at 12:45
  • @Priya Agarwal thank you, do you have a sample for this? – BdEngineer Jun 03 '20 at 14:09
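
Beyond the UI, the state of a DAG run and of each task in it can also be read programmatically. A minimal sketch against the Airflow 1.10 models API (the dag_id is the one from the answer below; substitute your own):

from airflow.models import DagRun

# Find the runs of the DAG; DagRun.find returns them ordered by execution date
dag_runs = DagRun.find(dag_id='testing_trigger_rule')
if dag_runs:
    latest_run = dag_runs[-1]
    print(latest_run.state)  # state of the whole run: running, success or failed

    # State of an individual task within that run
    task_instance = latest_run.get_task_instance('task1')
    if task_instance:
        print(task_instance.state)  # e.g. success, failed, upstream_failed

The same information is available from the CLI via the airflow dag_state and airflow task_state commands.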

1 Answer


Please refer to the sample code below:

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.utils.trigger_rule import TriggerRule
import datetime as dt

args = {
    'owner': 'airflow',
    'start_date': dt.datetime(2020, 6, 2)
}

dag = DAG(
    'testing_trigger_rule',
    schedule_interval="@daily",
    default_args=args
)

def task1():
    print('Running task1')

def task2():
    print('Running task2')

def task3():
    print('Running task3')

# all_success is already the default trigger rule; it is spelled out here for
# clarity: a task runs only when every directly upstream task succeeded.
Task1 = PythonOperator(
    task_id='task1',
    python_callable=task1,
    trigger_rule=TriggerRule.ALL_SUCCESS,
    dag=dag
)

Task2 = PythonOperator(
    task_id='task2',
    python_callable=task2,
    trigger_rule=TriggerRule.ALL_SUCCESS,
    dag=dag
)

Task3 = PythonOperator(
    task_id='task3',
    python_callable=task3,
    trigger_rule=TriggerRule.ALL_SUCCESS,
    dag=dag
)

# If task1 fails, task2 is not run, and therefore task3 is not run either.
Task1 >> Task2 >> Task3

For more information on trigger rules, see the "Trigger Rules" section of the Airflow concepts documentation.

Priya Agarwal
  • I am using BashOperator as I need to run spark-submit from a shell script, so I am not sure how to use trigger_rule there, i.e. how do I capture the status of my Spark job? Any clue/guidance please? (see the BashOperator sketch after these comments) – BdEngineer Jun 05 '20 at 04:43
  • Hi, you are saying ALL_SUCCESS here; does it mean all three tasks? My requirement is that if an earlier task fails, its dependent/following tasks should not start. – BdEngineer Jun 05 '20 at 13:07
  • ALL_SUCCESS means all the parent tasks were successful. In the defined case, Task3 depends on Task2, which in turn depends on Task1. So for Task2, ALL_SUCCESS means Task1 was successful, and for Task3, ALL_SUCCESS means both Task1 and Task2 were successful. – Priya Agarwal Jun 05 '20 at 13:41
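
On the BashOperator question: trigger_rule works the same way there, since it is set on the operator rather than on what the operator runs. What matters for capturing the Spark job's status is the exit code of bash_command: BashOperator marks the task failed whenever the command exits non-zero. So as long as the shell script propagates spark-submit's exit code (and spark-submit runs in a mode where its exit code reflects the application's final status), a failed job fails the task, and the default all_success rule keeps the downstream tasks from starting. A minimal sketch, with a hypothetical script path:

from airflow.operators.bash_operator import BashOperator

spark_task = BashOperator(
    task_id='spark_job',
    # The task fails whenever this command exits non-zero, so the exit code
    # of the wrapper script (and of spark-submit inside it) is the job status.
    # The trailing space keeps Airflow from rendering the .sh path as a Jinja
    # template file; the path itself is hypothetical.
    bash_command='bash /path/to/run_spark_job.sh ',
    dag=dag
)

Wiring it as spark_task >> downstream_task with the default trigger rule means downstream_task only starts when the script exits 0.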