8

Running Airflow 1.9.0 with Python 2.7. How do I gracefully stop a DAG?

In this case, I have a DAG that's running a file upload with bad code that causes everything to take 4 times as long, and I'd really prefer not to have to wait a day for it to finally time out (timeout is set to 10 hours).

The DAG looks for a tar file. When it finds one, it goes through every file in the tar, looking for files to process, and processes them.

I could not find any way to stop the DAG. I tried clicking on the "Running" circle in the "DAG Runs" column (the one to the right). It let me select the process and mark it as "failed". But it didn't stop running.

I tried clicking on the "Running" circle in the "Recent Tasks" column (the one to the left). It let me select processes, but trying to set them to failed (or to success) generated an exception in Airflow.

Greg Dougherty
  • 3,281
  • 8
  • 35
  • 58
  • Can you elaborate a bit on how your upload works? I can see a few approaches. 1. You have a DAG with a task that loops through a file list and actually uploads the files. 2. You have almost the same DAG, but you trigger it once per file to upload and then deal with the dag_runs. In the first case you can pause the DAG; in the second you can mark a run as failed. – Andrey Kartashov Mar 03 '18 at 16:48
  • @AndreyKartashov I tried marking it as failed. Marking it in "DAG Runs" 'worked', but it kept on running. Marking it in "Recent Tasks" generated an exception in Airflow. – Greg Dougherty Mar 04 '18 at 17:15
  • I see. We have the same issue with Docker: no matter what you do with the DAG, it will not kill the Docker container. So I mark a DAG run as failed and wait, but I'm looking for a solution to actually kill that process. – Andrey Kartashov Mar 05 '18 at 02:17
  • 1
    I have not tried marking the DAG run as failed and then clearing the task state. Clearing the task state might kill all the subprocesses. – Andrey Kartashov Mar 05 '18 at 02:19
  • 2
    I tried; it failed. I eventually added code to my DAG so I can force it to stop from the outside. I will confess it makes absolutely no sense to me why it's so difficult to stop a DAG. Has no one ever made a DAG that did something wrong? – Greg Dougherty Mar 17 '18 at 00:26
  • Usually a wrong step just fails, and execution stops by itself. – Andrey Kartashov Mar 17 '18 at 01:16
  • 1
    @AndreyKartashov I put the wrong file in the upload directory and now my DAG is processing it. Processing time is 40 hours. It is doing the wrong thing, and I want it to stop. Why is this a challenging concept? Or, there's a bug in my code, and it's deleting every file, even when the upload fails. I need to stop it and run better code. There is absolutely NO reason to believe that "set of DAGs that crash" is a complete superset of "set of DAGs with problems" – Greg Dougherty Mar 26 '18 at 20:44

3 Answers

9

Browse -> DAG Runs -> Checkbox -> With Selected "Delete"

Aclwitt
  • 134
  • 1
  • 8
3

If you constructed your process the way it sounds like you did, you won't be able to stop it from Airflow. You'll need to find the identifier of the process that is executing the work and forcibly terminate it to get it to actually stop.
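For example, here is a sketch of what that might look like on the worker host, assuming you have shell access to it; the "my_dag" and "my_task" identifiers are placeholders, not anything specific to this question:

# Sketch: locate and forcibly terminate the worker process for a running
# task instance. "my_dag" and "my_task" are placeholder identifiers.
import os
import signal
import subprocess

# Airflow launches each task instance as "airflow run <dag_id> <task_id> ...",
# so matching the full command line is enough to find the PID(s).
pids = subprocess.check_output(['pgrep', '-f', 'airflow run my_dag my_task'])
for pid in pids.split():
    os.kill(int(pid), signal.SIGKILL)  # SIGKILL: immediate, no cleanup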

joebeeson
  • 4,159
  • 1
  • 22
  • 29
2

You can add a sensor operator that runs in parallel with the task you want to be able to kill.

The sensor should monitor an Airflow Variable and complete when it detects a certain value in it, say 'STOP'.

The sensor should be followed by a BashOperator task that kills your long-running task using a command like the one below (replace the <dag_id> and <task_id> placeholders with patterns that identify your process):

kill -9 $(ps -ef | grep '<dag_id>' | grep '<task_id>' | grep -v grep | awk '{print $2}')
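Put together, a minimal sketch of this pattern, assuming Airflow 1.x import paths; the DAG id, task ids, and the 'stop_signal' Variable name are placeholders:

from datetime import datetime

from airflow import DAG
from airflow.models import Variable
from airflow.operators.bash_operator import BashOperator
from airflow.operators.sensors import BaseSensorOperator


class StopSignalSensor(BaseSensorOperator):
    """Completes once the 'stop_signal' Variable is set to 'STOP'."""
    def poke(self, context):
        return Variable.get('stop_signal', default_var='') == 'STOP'


dag = DAG('my_dag', start_date=datetime(2018, 1, 1), schedule_interval=None)

# Runs alongside the long task, re-checking the Variable every 30 seconds.
watch_stop_signal = StopSignalSensor(
    task_id='watch_stop_signal', poke_interval=30, dag=dag)

# Runs once the sensor succeeds, killing the long-running task's process.
kill_long_task = BashOperator(
    task_id='kill_long_task',
    bash_command=("kill -9 $(ps -ef | grep 'my_dag' | grep 'long_task' "
                  "| grep -v grep | awk '{print $2}')"),
    dag=dag)

watch_stop_signal >> kill_long_task

Setting the Variable (via Admin -> Variables in the UI, or airflow variables -s stop_signal STOP from the CLI) is then what triggers the kill.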

Adrian
  • 171
  • 1
  • 3