I have a DAG like the one below (semi-pseudocode). I want to run the tasks in different branches depending on the output of an upstream task.

# This function returns either a or b
def dosth():
    .....
    return a or b

t1 = PythonOperator(
    task_id='t1',
    python_callable=dosth
)

branchA = BashOperator(
    task_id='branchA',....
)

branchB = BashOperator(
    task_id='branchB',....
)

If dosth returns a, I want the DAG to run the task in branchA; if it returns b, I want it to run the task in branchB. Does anyone know how to approach this?

yuchennnnn

1 Answer

See the Airflow documentation on branching: https://airflow.apache.org/docs/stable/concepts.html?highlight=branch#branching

You need to use BranchPythonOperator. Its python_callable evaluates your condition and decides which task runs next.

Example based on your semi-pseudocode:

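def dosth(**kwargs):
    # With provide_context=True the task context arrives as keyword arguments
    if some_condition:
        return 'branchA'  # task_id of the branch to follow
    else:
        return 'branchB'

t1 = BranchPythonOperator(
    task_id='t1',
    provide_context=True,
    python_callable=dosth,
    dag=dag)

branchA = BashOperator(
    task_id='branchA',....
)

branchB = BashOperator(
    task_id='branchB',....
)

# Both branches must be directly downstream of the branching task
t1 >> [branchA, branchB]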

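The function you pass to python_callable should return the task_id of the task that should run next. That task must be directly downstream of the BranchPythonOperator; the other directly downstream tasks are skipped.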
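
To make that contract concrete, here is a rough sketch in plain Python with no Airflow involved. The names choose_branch and resolve are made up for illustration and are not part of Airflow; the sketch only mimics the outcome (the chosen task runs, the other downstream tasks are skipped), not how the scheduler actually does it.

```python
# Plain-Python illustration only: no Airflow import, hypothetical helper names.

def choose_branch(value):
    """Hypothetical branch callable: map an upstream result to a task_id."""
    return "branchA" if value == "a" else "branchB"


def resolve(downstream_task_ids, chosen):
    """Mimic the branching outcome: mark each downstream task run or skipped."""
    # Accept either a single task_id or a collection of task_ids.
    chosen_ids = {chosen} if isinstance(chosen, str) else set(chosen)
    unknown = chosen_ids - set(downstream_task_ids)
    if unknown:
        # The returned task_id has to name a task that is actually downstream.
        raise ValueError(f"not a downstream task: {sorted(unknown)}")
    return {
        task_id: ("run" if task_id in chosen_ids else "skipped")
        for task_id in downstream_task_ids
    }


states_a = resolve(["branchA", "branchB"], choose_branch("a"))
states_b = resolve(["branchA", "branchB"], choose_branch("b"))
print(states_a)
print(states_b)
```

The point of the sketch is that the callable returns a string naming a task, not the operator object and not the raw value a or b.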

Another example, this time branching on an XCom value pushed by an upstream task:

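# Import paths for Airflow 1.10; in Airflow 2 these modules were renamed
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import BranchPythonOperator

def branch_func(**kwargs):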
    ti = kwargs['ti']
    xcom_value = int(ti.xcom_pull(task_ids='start_task'))
    if xcom_value >= 5:
        return 'continue_task'
    else:
        return 'stop_task'

start_op = BashOperator(
    task_id='start_task',
    bash_command="echo 5",
    xcom_push=True,
    dag=dag)

branch_op = BranchPythonOperator(
    task_id='branch_task',
    provide_context=True,
    python_callable=branch_func,
    dag=dag)

continue_op = DummyOperator(task_id='continue_task', dag=dag)
stop_op = DummyOperator(task_id='stop_task', dag=dag)

start_op >> branch_op >> [continue_op, stop_op]
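
If you want to check the branching logic without running a scheduler, one option is to call branch_func directly with a stand-in for the task instance. The sketch below assumes a hypothetical FakeTaskInstance class (not an Airflow class) that only implements xcom_pull, and it passes the value as a string because the echo output is pulled as text before the int() conversion.

```python
class FakeTaskInstance:
    """Hypothetical stand-in for a task instance; only xcom_pull is provided."""

    def __init__(self, value):
        self._value = value

    def xcom_pull(self, task_ids=None):
        # Return the canned value regardless of which task is asked for.
        return self._value


def branch_func(**kwargs):
    # Same logic as the branch callable in the example above.
    ti = kwargs["ti"]
    xcom_value = int(ti.xcom_pull(task_ids="start_task"))
    if xcom_value >= 5:
        return "continue_task"
    else:
        return "stop_task"


result_high = branch_func(ti=FakeTaskInstance("5"))
result_low = branch_func(ti=FakeTaskInstance("4"))
print(result_high, result_low)
```

This only exercises the callable itself; it does not verify that continue_task and stop_task are wired downstream of branch_task in the DAG.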
kaxil