
I have an Airflow HttpSensor that calls a REST endpoint and checks for a specific value in the JSON structure returned by the API:

sensor = HttpSensor(
    soft_fail=True,
    task_id='http_sensor_check',
    http_conn_id='http_default',
    endpoint='http://localhost:8082/api/v1/resources/games/all',
    request_params={},
    response_check=lambda response: True if check_api_response(response) is True else False,
    mode='reschedule',
    dag=dag)

If response_check returns False, the DAG is put in an "up_for_reschedule" state. The problem is that the DAG stayed in that state forever and never got rescheduled.

My questions are:

  • What does "up_for_reschedule" mean, and when would the DAG be rescheduled?
  • Suppose my DAG is scheduled to run every 5 minutes but, because of the sensor, the "up_for_reschedule" DAG instance overlaps with the next run. Will I have 2 DAG runs executing at the same time?

Thank you in advance.

OCDev

2 Answers


In a sensor, mode='reschedule' means that if the sensor's condition is not yet met, the sensor releases its worker slot so that other tasks can use it. This is very useful when the sensor may have to wait for a long time.

  1. up_for_reschedule means that the sensor condition isn't true yet and the sensor hasn't reached its timeout, so the task is waiting to be rescheduled by the scheduler.
  2. You can't know exactly when the task will run again. That depends on the scheduler (available resources, priorities, etc.). If you don't want to allow parallel DAG runs, set max_active_runs=1 in the DAG constructor.
Elad Kalif

Side note:

response_check=lambda response: True if check_api_response(response) is True else False,

is the same as:

response_check=lambda response: check_api_response(response),
  • Since the lambda only passes its single argument straight through to the function, you can write `response_check=check_api_response,` directly. – NamshubWriter Jul 31 '23 at 15:25
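A quick way to convince yourself that the three spellings behave the same is to run them against a stand-in response. This is only a sketch: FakeResponse and the body of check_api_response are hypothetical, since the question does not show the real function, and the equivalence assumes check_api_response returns a real bool.

```python
import json


class FakeResponse:
    """Stand-in for a requests.Response, exposing only .json()."""

    def __init__(self, body):
        self._body = body

    def json(self):
        return json.loads(self._body)


def check_api_response(response):
    # Hypothetical check: True when the payload reports status "ready".
    return response.json().get("status") == "ready"


# The three spellings discussed above.
verbose = lambda response: True if check_api_response(response) is True else False
wrapped = lambda response: check_api_response(response)
direct = check_api_response

ready = FakeResponse('{"status": "ready"}')
pending = FakeResponse('{"status": "pending"}')

# All three agree for both a matching and a non-matching payload.
for check in (verbose, wrapped, direct):
    assert check(ready) is True
    assert check(pending) is False
```

One caveat: if check_api_response returned a truthy non-bool value such as 1, the `is True` version would yield False while the other two would pass the value through, so the forms are only interchangeable when the function returns an actual bool.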