
I was going through the Apache Airflow tutorial (https://github.com/hgrif/airflow-tutorial) and encountered this section on defining task dependencies.

with DAG('airflow_tutorial_v01',
     default_args=default_args,
     schedule_interval='0 * * * *',
     ) as dag:

    print_hello = BashOperator(task_id='print_hello',
                               bash_command='echo "hello"')
    sleep = BashOperator(task_id='sleep',
                         bash_command='sleep 5')
    print_world = PythonOperator(task_id='print_world',
                                 python_callable=print_world)

    print_hello >> sleep >> print_world

The line that confuses me is

print_hello >> sleep >> print_world

What does >> mean in Python? I know it as the bitwise shift operator, but I can't relate that to the code here.

smci
idazuwaika
  • Yes, `>>` is bitwise shift by default, but you can define it to be whatever you want on your own classes. Airflow has defined it to be a sequencing operator. – kindall Sep 18 '18 at 14:43
  • `+` is addition of numbers but also concatenation for strings or lists. This is the same. `>>` is bitwise shift for numbers but sequencing for airflow. – Giacomo Alzetta Sep 18 '18 at 14:45
  • aah operator overloading (if I recall the term correctly). – idazuwaika Sep 18 '18 at 14:47
  • Yes, that is, indeed, correct. – N Chauhan Sep 18 '18 at 14:48
  • Tagged [tag:operators], [tag:bit-shift]. It's especially important to tag these, since they cannot be found by search. – smci Jan 23 '20 at 15:52

1 Answer


Airflow represents workflows as directed acyclic graphs. A workflow is any number of tasks that have to be executed, either in parallel or sequentially. The ">>" is Airflow syntax for setting a task downstream of another.

In the incubator-airflow project repo, models.py in the airflow directory defines the behavior of many of Airflow's high-level abstractions. You can dig into the other classes there if you'd like, but the one that answers your question is the BaseOperator class. All operators in Airflow inherit from BaseOperator. Its __rshift__ method overloads Python's right-shift operator (>>) so that it sets a task or a DAG downstream of another.

See implementation here.
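
To make the mechanism concrete, here is a minimal sketch of how a class can overload `>>`. This is not Airflow's code: the `Task` class and its `downstream` list are hypothetical stand-ins, and the only assumption carried over from Airflow is the general idea that `__rshift__` records the dependency and returns the right-hand operand so that chains work.

```python
class Task:
    """Toy stand-in for an Airflow operator (hypothetical, not Airflow's class)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # tasks that must run after this one

    def set_downstream(self, other):
        # Record that `other` depends on this task.
        self.downstream.append(other)

    def __rshift__(self, other):
        # Python calls this method when it evaluates `self >> other`.
        self.set_downstream(other)
        # Returning `other` lets `a >> b >> c` chain left to right:
        # it is evaluated as `(a >> b) >> c`, i.e. `b >> c`.
        return other


print_hello = Task("print_hello")
sleep = Task("sleep")
print_world = Task("print_world")

print_hello >> sleep >> print_world

print([t.task_id for t in print_hello.downstream])  # ['sleep']
print([t.task_id for t in sleep.downstream])        # ['print_world']
```

For plain integers nothing changes: `8 >> 1` is still a bitwise shift and evaluates to `4`. Only classes that define `__rshift__` give the operator a different meaning.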

Alessandro Cosentino
rob
  • The link to the line goes into the master branch. I suggest picking a specific branch so that version changes will not affect the link. – tobi6 Sep 19 '18 at 06:37