
I am trying to trigger multiple external DAGs (Dataflow jobs) from a master DAG.

I plan to use TriggerDagRunOperator and ExternalTaskSensor. I have around 10 Dataflow jobs; some must run in sequence and some in parallel. For example, I want to run the Dataflow DAGs A, B, C, etc. from the master DAG, and before execution moves on to the next task I want to ensure that the previous DAG run has completed. However, I am having issues importing the ExternalTaskSensor module. Is there an alternative way to achieve this?

Note: each DAG (e.g. A, B, C) has 6-7 tasks. Can ExternalTaskSensor check that the last task of DAG A has completed before DAG B or C starts?

Sakshi Gatyan
Hi @recyclinguy, it is recommended to provide some sample code when asking questions. See how to [ask](https://stackoverflow.com/help/how-to-ask) questions. Could you also provide the error messages you're getting when importing the ExternalTaskSensor module? – Prajna Rai T Sep 07 '21 at 14:49

1 Answer


I used the sample code below to run DAGs that use ExternalTaskSensor, and I was able to import the ExternalTaskSensor module successfully.

from datetime import datetime, timedelta
from pprint import pprint

# Airflow 1.10-style import paths. In Airflow 2 these modules were moved to
# airflow.operators.trigger_dagrun, airflow.operators.dummy,
# airflow.operators.python and airflow.sensors.external_task.
from airflow import DAG
from airflow.operators.dagrun_operator import TriggerDagRunOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.sensors.external_task_sensor import ExternalTaskSensor

sensors_dag = DAG(
    "test_launch_sensors",
    schedule_interval=None,
    start_date=datetime(2020, 2, 14, 0, 0, 0),
    dagrun_timeout=timedelta(minutes=150),
    tags=["DEMO"],
)

dummy_dag = DAG(
    "test_dummy_dag",
    schedule_interval=None,
    start_date=datetime(2020, 2, 14, 0, 0, 0),
    dagrun_timeout=timedelta(minutes=150),
    tags=["DEMO"],
)


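def print_context(ds, **context):
    # context["conf"] is Airflow's own configuration object; the payload passed
    # by TriggerDagRunOperator(conf=...) is available on the DagRun instead.
    pprint(context["dag_run"].conf)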


with dummy_dag:
    starts = DummyOperator(task_id="starts", dag=dummy_dag)
    empty = PythonOperator(
        task_id="empty",
        provide_context=True,
        python_callable=print_context,
        dag=dummy_dag,
    )
    ends = DummyOperator(task_id="ends", dag=dummy_dag)

    starts >> empty >> ends

with sensors_dag:
    trigger = TriggerDagRunOperator(
        task_id=f"trigger_{dummy_dag.dag_id}",
        trigger_dag_id=dummy_dag.dag_id,
        conf={"key": "value"},
        execution_date="{{ execution_date }}",
    )
    sensor = ExternalTaskSensor(
        task_id="wait_for_dag",
        external_dag_id=dummy_dag.dag_id,
        external_task_id="ends",
        poke_interval=5,
        timeout=120,
    )
    trigger >> sensor

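In the sample code above, sensors_dag triggers a run of dummy_dag with TriggerDagRunOperator. The ExternalTaskSensor then makes sensors_dag wait until the task named in external_task_id ("ends", the last task of dummy_dag) has succeeded. By default the sensor looks for an external task with the same execution date as its own run, which is why the trigger passes execution_date="{{ execution_date }}" to the triggered run.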
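To extend this to the master DAG from the question, the same trigger-plus-sensor pair can be repeated for each child DAG and wired up in stages. The following is only a sketch under assumptions: the DAG ids (dag_a to dag_d), the STAGES layout, the "master_dag" id, the poke_interval and timeout values, and the helper names task_ids and build_master_dag are all hypothetical, and every child DAG is assumed to end with a task called "ends", as dummy_dag does.

```python
from datetime import datetime

# Hypothetical layout: each inner list is one stage. Stages run in sequence,
# and the DAGs inside a stage run in parallel.
STAGES = [["dag_a"], ["dag_b", "dag_c"], ["dag_d"]]

# Assumption: every child DAG ends with a task called "ends".
LAST_TASK_ID = "ends"


def task_ids(child_dag_id):
    """Return the (trigger, sensor) task ids used for one child DAG."""
    return f"trigger_{child_dag_id}", f"wait_for_{child_dag_id}"


def build_master_dag(stages=STAGES):
    # Airflow is imported inside the function so the helper above can be used
    # without Airflow installed. Same import paths as the sample code.
    from airflow import DAG
    from airflow.operators.dagrun_operator import TriggerDagRunOperator
    from airflow.sensors.external_task_sensor import ExternalTaskSensor

    dag = DAG(
        "master_dag",
        schedule_interval=None,
        start_date=datetime(2020, 2, 14),
    )
    previous_sensors = []
    with dag:
        for stage in stages:
            current_sensors = []
            for child_dag_id in stage:
                trigger_id, sensor_id = task_ids(child_dag_id)
                trigger = TriggerDagRunOperator(
                    task_id=trigger_id,
                    trigger_dag_id=child_dag_id,
                    # Reuse the master run's execution date so the sensor
                    # below can find the triggered run.
                    execution_date="{{ execution_date }}",
                )
                sensor = ExternalTaskSensor(
                    task_id=sensor_id,
                    external_dag_id=child_dag_id,
                    external_task_id=LAST_TASK_ID,
                    poke_interval=30,
                    timeout=60 * 60,
                )
                # A stage starts only after every sensor of the previous
                # stage has succeeded.
                for upstream in previous_sensors:
                    upstream >> trigger
                trigger >> sensor
                current_sensors.append(sensor)
            previous_sensors = current_sensors
    return dag
```

In a real DAG file the result would need to be exposed at module level, for example master_dag = build_master_dag(), so that the scheduler can discover it. With the layout above, dag_b and dag_c start in parallel once dag_a's last task has succeeded, and dag_d starts after both have finished. If importing ExternalTaskSensor remains a problem and the environment runs Airflow 2, TriggerDagRunOperator also accepts wait_for_completion=True, which may remove the need for a separate sensor.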

Prajna Rai T