We use Airflow in a hybrid ETL system: some of our DAGs are scheduled, while others are triggered externally through the Airflow API.
We are trying to put a sensor in a scheduled DAG (DAG1) that detects whether a task inside an externally triggered DAG (DAG2) has run.
For example, DAG1 runs at 11:00, and we want to be sure that DAG2 has run (due to an external trigger) at least once since 00:00. I have tried setting execution_delta=timedelta(hours=11), but the sensor never detects anything. I think the problem is that the sensor looks for a DAG2 run whose execution date is exactly 00:00. That will never be the case, because DAG2 can be triggered at any time between 00:00 and 11:00.
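To illustrate what I think is happening, here is a minimal pure-Python sketch (no Airflow involved). The function names and timestamps are hypothetical, and it assumes DAG1's run is stamped 11:00 and that the sensor compares against a single exact timestamp, which is my understanding rather than something I have confirmed. It contrasts that exact-match lookup with the window check we actually want.

```python
from datetime import datetime, timedelta, timezone


def exact_match_target(logical_date, execution_delta):
    """The single timestamp an exact-match lookup would search for."""
    return logical_date - execution_delta


def ran_in_window(run_dates, window_start, window_end):
    """True if at least one run date falls inside [window_start, window_end]."""
    return any(window_start <= d <= window_end for d in run_dates)


# Hypothetical example: DAG1's run is stamped 11:00, DAG2 was triggered at 07:23.
dag1_logical_date = datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc)
dag2_run_dates = [datetime(2024, 1, 1, 7, 23, tzinfo=timezone.utc)]

# With execution_delta = 11 hours the exact-match target is 00:00.
target = exact_match_target(dag1_logical_date, timedelta(hours=11))

exact_hit = target in dag2_run_dates  # what I believe the sensor does
window_hit = ran_in_window(dag2_run_dates, target, dag1_logical_date)  # what we want

print(exact_hit, window_hit)  # False True
```

The 07:23 run is missed by the exact match but caught by the window check, which matches the behaviour I am seeing.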
Is there a way to achieve this? I think we might need to write a custom sensor, but it seems strange to me that the native Airflow sensor does not cover this case.
This is the sensor I'm defining:
from datetime import timedelta

from airflow.sensors import external_task

sensor = external_task.ExternalTaskSensor(
    task_id='sensor',
    dag=dag,
    external_dag_id='DAG2',
    external_task_id='sensed_task',
    mode='reschedule',
    check_existence=True,
    execution_delta=timedelta(hours=int(execution_type)),
    poke_interval=10 * 60,  # Check every 10 minutes
    timeout=1 * 60 * 60,  # Allow for 1 hour of delay in execution
)