I am trying to create a sequence of tasks like the one below using Airflow 2.3+:

START -> generate_files -> download_file -> STOP

Instead, I am getting the flow below. The code is also given. Please advise.

[Image: graph view of the flow actually produced]

from airflow import DAG
from airflow.decorators import task
from datetime import datetime
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago
from airflow.utils.trigger_rule import TriggerRule

with DAG('my_dag', start_date=days_ago(1), schedule_interval='@daily', catchup=False) as dag:

    START = BashOperator(task_id="start", bash_command='echo "starting batch pipeline"', do_xcom_push=False)
    STOP = BashOperator(task_id="stop", bash_command='echo "stopping batch pipeline"', trigger_rule=TriggerRule.NONE_SKIPPED, do_xcom_push=False)

    @task
    def generate_files():
        return ["file_1", "file_2", "file_3"]

    @task
    def download_file(file):
        print(file)

    START >> download_file.expand(file=generate_files()) >> STOP
Santanu Ghosh

1 Answer

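Define the dependency from START to generate_files explicitly. Call generate_files() once, keep the object it returns, and use that same object both in the dependency chain after START and as the argument passed to the mapped download_file task: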

    files = generate_files()
    START >> files >> download_file.expand(file=files) >> STOP

[Image: DAG graph]
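
As a rough illustration of why the original line leaves generate_files detached from START, here is a toy model in plain Python. The Task class and the build helper are hypothetical stand-ins, not Airflow's implementation. The sketch assumes only two things: that `>>` links its left operand to its right operand, and that passing one task's output as an argument makes that task upstream of the receiver.

```python
class Task:
    """Toy stand-in for a task node. Hypothetical; this is not Airflow's class."""

    def __init__(self, name, *inputs):
        self.name = name
        # Assumption: consuming another task's output makes it an upstream task.
        self.upstream = {dep.name for dep in inputs}

    def __rshift__(self, other):
        # a >> b records a as upstream of b and returns b, so chains read left to right.
        other.upstream.add(self.name)
        return other


def build(explicit):
    """Return each task's upstream names for the original or the corrected chain."""
    start, stop = Task("start"), Task("stop")
    files = Task("generate_files")
    # Plays the role of download_file.expand(file=files).
    download = Task("download_file", files)
    if explicit:
        # Corrected chain: generate_files is named between start and download_file.
        start >> files >> download >> stop
    else:
        # Original chain: generate_files appears only as an argument.
        start >> download >> stop
    return {t.name: sorted(t.upstream) for t in (start, files, download, stop)}


original = build(explicit=False)
fixed = build(explicit=True)

print(original["generate_files"])  # []
print(fixed["generate_files"])     # ['start']
```

In the original chain, generate_files has no upstream task, so it sits beside START rather than after it, while download_file ends up with both START and generate_files upstream. Naming the generate_files() result in the chain is what places it after START.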

Oluwafemi Sule
  • This approach did not work; it gave an attribute error: 'task decorator' object has no attribute 'update_relative' – Santanu Ghosh Oct 11 '22 at 05:53
  • @SantanuGhosh You need to invoke the function to create the operator instance. This same instance must be used in the dag structure after the START and passed to the mapped download_file task. – Oluwafemi Sule Oct 11 '22 at 09:33
  • Thanks for the reply; however, I have found an alternate way to achieve that using TaskGroup – Santanu Ghosh Oct 11 '22 at 14:44
  • Hey Santanu, can you share your approach? That would be helpful for the community – Vineet Oct 12 '22 at 10:28