Many of our Airflow (MWAA) tasks are receiving SIGTERM:

[2022-10-06 06:23:48,347] {{logging_mixin.py:104}} INFO - [2022-10-06 06:23:48,347] {{local_task_job.py:188}} WARNING - State of this instance has been externally set to success. Terminating instance.
[2022-10-06 06:23:48,348] {{process_utils.py:100}} INFO - Sending Signals.SIGTERM to GPID 2740
[2022-10-06 06:23:55,113] {{taskinstance.py:1265}} ERROR - Received SIGTERM. Terminating subprocesses.
[2022-10-06 06:23:55,164] {{process_utils.py:66}} INFO - Process psutil.Process(pid=2740, status='terminated', exitcode=1, started='06:23:42') (2740) terminated with exit code 1

It only happens to a few of our tasks, and it would not be a big deal if those tasks were not also marked as SUCCESS:

State of this instance has been externally set to success. Terminating instance

We understand that this can happen when a worker runs out of memory. We tried increasing the number of workers, without success. How can we prevent these tasks from being killed externally?

val

1 Answer

When tasks get killed, they are normally marked as failed. Here it seems to be the other way around: the task gets marked as a success by something or someone, after which the job is stopped/killed.

I am not aware of how MWAA is deployed, but I would have a look at Airflow's action logging (the audit log) to see what or who is marking these tasks as success.
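To line these events up against the audit log, it can help to pull the exact timestamps of the "externally set" warnings out of the task logs first. Below is a minimal sketch using only the standard library. The names `ExternalStateChange` and `find_external_state_changes` are made up for illustration, and the regular expression assumes the log format shown in the question (including the doubled braces), so it may need adjusting for other logging configurations.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple


class ExternalStateChange(NamedTuple):
    """One 'externally set' warning found in a task log (hypothetical helper type)."""
    timestamp: datetime
    state: str


# Assumption: lines look like the ones in the question, e.g.
# [2022-10-06 06:23:48,347] {{local_task_job.py:188}} WARNING - State of this
# instance has been externally set to success. Terminating instance.
_PATTERN = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"\{+local_task_job\.py:\d+\}+ WARNING - "
    r"State of this instance has been externally set to (?P<state>\w+)\."
)


def find_external_state_changes(lines: Iterable[str]) -> List[ExternalStateChange]:
    """Return the timestamp and target state of every matching warning."""
    changes = []
    for line in lines:
        match = _PATTERN.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            changes.append(ExternalStateChange(ts, match.group("state")))
    return changes


if __name__ == "__main__":
    # Two of the log lines quoted in the question.
    sample = [
        "[2022-10-06 06:23:48,347] {{logging_mixin.py:104}} INFO - "
        "[2022-10-06 06:23:48,347] {{local_task_job.py:188}} WARNING - "
        "State of this instance has been externally set to success. "
        "Terminating instance.",
        "[2022-10-06 06:23:48,348] {{process_utils.py:100}} INFO - "
        "Sending Signals.SIGTERM to GPID 2740",
    ]
    for change in find_external_state_changes(sample):
        print(change.timestamp.isoformat(), change.state)
```

With the timestamps in hand, you can look for audit log entries on the same DAG and task around those moments, which should narrow down whether a user, the CLI, or some other process changed the state.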

Jorrick Sleijster
There is no reason why the DAG could be marked as a success by another task or another person. It happens with standalone tasks that have no interaction with others, and I am 100% sure that no one marked them as a success from the UI/CLI. Could it be related to the fact that my tasks have no return value? – val Oct 10 '22 at 15:56