13

I suspected that

airflow run dag_id task_id execution_date

would run all upstream tasks, but it does not. It simply fails when it sees that not all of the upstream tasks have run. How can I run a specific task together with all of its dependencies? I am guessing this is not possible because of an Airflow design decision, but is there a way to get around it?

sophros
itzjustricky
  • I am facing the same problem but don't have a solution. I think there might be a way to get around this using the UI and running a task from a DAG that is in a running state. – nehiljain Mar 06 '17 at 22:01

2 Answers

9

You can run a task independently by using -i/-I/-A flags along with the run command.

But yes, the design of Airflow does not permit running a specific task together with all of its dependencies.

For testing purposes, you can remove the unrelated tasks from the DAG and then backfill it.

Priyank Mehta
  • Then what is the best alternative to Airflow to achieve the goal (run a task and its dependencies from a DAG)? – Hailiang Zhang Apr 11 '17 at 18:19
  • Azkaban has this feature. In the Execute Flow popup you can disable any job, or disable its upstream or downstream dependencies, and then execute the flow. https://azkaban.readthedocs.io/en/latest/useAzkaban.html#executing-flow-view – ismail Jul 03 '19 at 10:06
1

This is a bit of a workaround, but if you have named your task_ids consistently, you can try backfilling from the Airflow CLI (command-line interface):

airflow backfill -t TASK_REGEX ... dag_id

where TASK_REGEX is a regular expression that matches the task_id of the task you want to rerun as well as the task_ids of its dependencies.

(Remember to add the rest of the command-line options, such as --start_date.)
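To see what such a regex would need to cover, here is a minimal sketch in plain Python. It is not Airflow code: the task names, the UPSTREAM mapping, and the select_tasks helper are all hypothetical and only illustrate the idea of "the matching tasks plus everything upstream of them". Whether backfill pulls in upstream tasks for you may depend on your Airflow version, so it is worth checking `airflow backfill --help`.

```python
import re

# Hypothetical DAG: each task_id maps to the task_ids directly upstream of it.
UPSTREAM = {
    "extract_orders": [],
    "extract_users": [],
    "transform_orders": ["extract_orders"],
    "load_orders": ["transform_orders"],
    "load_users": ["extract_users"],
}


def select_tasks(task_regex, upstream, include_upstream=True):
    """Return the task_ids matching task_regex, optionally with all upstream tasks."""
    selected = {task_id for task_id in upstream if re.search(task_regex, task_id)}
    if include_upstream:
        # Walk the dependency graph upwards from every matched task.
        stack = list(selected)
        while stack:
            for parent in upstream[stack.pop()]:
                if parent not in selected:
                    selected.add(parent)
                    stack.append(parent)
    return sorted(selected)


# One task plus everything it depends on.
print(select_tasks("^load_orders$", UPSTREAM))
# Only the tasks whose names match, with no upstream expansion.
print(select_tasks("^load_", UPSTREAM, include_upstream=False))
```

If your task_ids share a prefix or suffix per pipeline branch, a single pattern such as "orders" can cover a task and its dependencies without listing them one by one, which is why consistent naming matters for this workaround.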

sophros