I'm trying to make an Airflow task intentionally fail and error out by passing in a Bash line (thisshouldnotrun) that doesn't work. Airflow is outputting the following:

[2017-06-15 17:44:17,869] {bash_operator.py:94} INFO - /tmp/airflowtmpLFTMX7/run_bashm2MEsS: line 7: thisshouldnotrun: command not found
[2017-06-15 17:44:17,869] {bash_operator.py:97} INFO - Command exited with return code 127
[2017-06-15 17:44:17,869] {models.py:1417} ERROR - Bash command failed
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/models.py", line 1374, in run
    result = task_copy.execute(context=context)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/operators/bash_operator.py", line 100, in execute
    raise AirflowException("Bash command failed")
AirflowException: Bash command failed
[2017-06-15 17:44:17,871] {models.py:1433} INFO - Marking task as UP_FOR_RETRY
[2017-06-15 17:44:17,878] {models.py:1462} ERROR - Bash command failed
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/airflow", line 28, in <module>
    args.func(args)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/bin/cli.py", line 585, in test
    ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/utils/db.py", line 53, in wrapper
    result = func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/models.py", line 1374, in run
    result = task_copy.execute(context=context)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/operators/bash_operator.py", line 100, in execute
    raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed

Will Airflow send an email for these kinds of errors? If not, what would be the best way to send an email for these errors?

I'm not even sure if airflow.cfg is set up properly... Since the ultimate goal is to test the email alerting notification, I want to make sure it is. Here's the setup:

[email]
email_backend = airflow.utils.email.send_email_smtp


[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = emailsmtpserver.region.amazonaws.com 
smtp_starttls = True
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
# smtp_user = airflow_data_user
# smtp_password = password
smtp_port = 587 
smtp_mail_from = airflow_data_user@domain.com

What is smtp_starttls? I can't find any info for it in the documentation or online. If 2-factor authentication is required to view emails, will that be an issue for Airflow?

Here's my Bash command:

task1_bash_command = """
export PATH=/home/ubuntu/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
export rundate=`TZ='America/Los_Angeles' date +%F -d "yesterday"`
export AWS_CONFIG_FILE="/home/ubuntu/.aws/config"

/home/ubuntu/bin/snowsql -f /home/ubuntu/sql/script.sql 1> /home/ubuntu/logs/"$rundate"_dev.log 2> /home/ubuntu/logs/"$rundate"_error_dev.log

# -s: fail only if the error log is non-empty (the 2> redirect above creates
# the file even on success, so -e would make this branch always taken)
if [ -s /home/ubuntu/logs/"$rundate"_error_dev.log ]
then
    exit 64
fi
"""

And my task:

task1 = BashOperator(
    task_id = 'run_bash',
    bash_command = task1_bash_command,
    dag = dag,
    retries = 2,
    email_on_failure = True,
    email = 'username@domain.com')
simplycoding

3 Answers


smtp_starttls basically means "use STARTTLS": the connection starts unencrypted and is upgraded to TLS after the STARTTLS command.

Set this to False and set smtp_ssl to True if you want to use SSL instead. You probably need smtp_user and smtp_password for either.
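For example, an SSL-based [smtp] section might look like this (port 465 is the conventional SMTPS port; host and credentials below are placeholders to adapt to your server):

[smtp]
smtp_host = emailsmtpserver.region.amazonaws.com
smtp_starttls = False
smtp_ssl = True
smtp_user = airflow_data_user
smtp_password = password
smtp_port = 465
smtp_mail_from = airflow_data_user@domain.com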

Airflow will not handle 2-step authentication. However, if you are using AWS you likely don't need it, as your SMTP (SES) credentials are different from your AWS credentials.

See here.

EDIT: For Airflow to send an email on failure, a couple of things need to be set on your task: email_on_failure and email.

See here for example:

def throw_error(**context):
    raise ValueError('Intentionally throwing an error to send an email.')



t1 = PythonOperator(task_id='throw_error_and_email',
                    python_callable=throw_error,
                    provide_context=True,
                    email_on_failure=True,
                    email='your.email@whatever.com',
                    dag=dag)
jhnclvr
  • Thanks for the clarification. I'm still trying to understand what types of errors Airflow will or will not catch - is my example out of Airflow's scope? – simplycoding Jun 15 '17 at 19:42
  • It should catch any task failure, but you have to define your task in a certain way. Please see my edit to my answer for an example. – jhnclvr Jun 15 '17 at 19:49
  • Yeah I think my problem is trying to make Bash throw an error, so nothing Airflow related – simplycoding Jun 15 '17 at 20:10
  • do you know what Airflow considers a Bash error? When I invoke that `thisshouldnotrun` command, the system returns a `127` error as expected, but Airflow doesn't seem to take it as a failure. It doesn't even retry the task. Any idea how to make it throw a full error? – simplycoding Jun 15 '17 at 20:39
  • What status is the task in when it completes? A bash command that should do what you want would be: `exit 1` – jhnclvr Jun 15 '17 at 20:41
  • Right, is it normal for Bash to continue through commands despite errors? It seems to recognize that `thisshouldnotrun` is not a valid command, but it continues to the next line, so no error seems to be thrown. Side question - would email notifications go out when running `airflow test`? – simplycoding Jun 15 '17 at 20:46
  • I think I've got my email settings and the `airflow.cfg` setup wrong, or `airflow test` doesn't email on failure – simplycoding Jun 15 '17 at 23:16
  • It should stop on an error and fail the task in which the error occurred. Maybe post some of your dag definition file for clarity, but it should fail that BashOperator at least when running the dag. Not sure about airflow.test. When a task fails and you have it setup to email, it should log an error in the task log if it tried to send a mail and failed. – jhnclvr Jun 16 '17 at 13:46
  • I added my bash code since I'm still having issues, and I think our email smtp configurations are setup as well. When I run the dag, I see Airflow `Starting attempt 1 of 3`, catch the `exit 64`, and output errors which end with `Bash command failed`. Thoughts? – simplycoding Jun 20 '17 at 17:22
  • It's still not sending the email though? It should log an error in that same log about why it couldn't send the email if it tried. – jhnclvr Jun 20 '17 at 19:41
  • You mean the log files where I'm redirecting from `stderr` and `stdout`? I looked at those files and didn't see anything related to emails and whatnot. I tried looking under `/home/ubuntu/airflow/logs` but don't see any logs there made for this dag either. I should also mention that this is all through running `airflow test` if that makes any difference – simplycoding Jun 20 '17 at 20:57
  • I mean in the UI, you can click "view log", you should see your smtp error in the log for this particular task failure. – jhnclvr Jun 20 '17 at 21:07
  • Ok, I looked under `localhost:8080/admin/log` and see the dag_id but no events related to email or smtp. If that's what you're talking about, I don't see anything, but just a bunch of other UI-related events – simplycoding Jun 20 '17 at 21:30
  • In the UI, go to the graph view of your dag run. (Click on the dag ID, and then in the drop down select the specific run) Click the task that should send the email, and then click "View Log". It should be in that log – jhnclvr Jun 21 '17 at 13:23
  • I didn't read all these comments but just a suggestion you don't have to use the BashOperator to run your bash commands. What I use is a PythonOperator and I do everything in Python. You can use `subprocess.run(...)` or one of the variations of that Python library to run linux commands. Then you can evaluate the return value yourself and manually throw an `AirflowException` if the value is not what you expected. If an AirflowException is thrown it'll always mark the task as failed. So don't feel restrained to operators be creative :) – Kyle Bridenstine Nov 09 '18 at 15:35
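The suggestion in the last comment can be sketched like this (a Python 3 sketch with a hypothetical run_command helper; in a real DAG callable you would raise airflow.exceptions.AirflowException instead of RuntimeError so Airflow marks the task failed):

```python
import subprocess

def run_command(cmd):
    """Run a shell command; raise if it exits non-zero.

    In an Airflow PythonOperator callable, raise AirflowException
    here instead so the task is marked as failed.
    """
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            "command failed with exit code %d: %s"
            % (result.returncode, result.stderr.strip())
        )
    return result.stdout
```

A command like `thisshouldnotrun` then raises with exit code 127, failing the task and triggering email_on_failure.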

If we have 2-factor authentication needed to view emails, will that be an issue here for Airflow?

You can use a Google app password to get around 2-factor authentication:

https://support.google.com/mail/answer/185833?hl=en-GB
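With an app password in hand, the [smtp] section would look something like this (all values below are placeholders):

[smtp]
smtp_host = smtp.gmail.com
smtp_starttls = True
smtp_ssl = False
smtp_user = your.email@gmail.com
smtp_password = your-16-character-app-password
smtp_port = 587
smtp_mail_from = your.email@gmail.com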

Source - https://docs.aws.amazon.com/mwaa/latest/userguide/configuring-env-variables.html


Smit Thakkar

Use the link below for creating an Airflow DAG:
How to trigger daily DAG run at midnight local time instead of midnight UTC time

Approach 1: You can set up SMTP locally and have it send email on job failure.

[email]
email_backend = airflow.utils.email.send_email_smtp

[smtp]
smtp_host = localhost
smtp_starttls = False
smtp_ssl = False
smtp_port = 25
smtp_mail_from = noreply@company.com

Approach 2: You can use Gmail to send email. I have written an article on how to do this: https://helptechcommunity.wordpress.com/2020/04/04/airflow-email-configuration/
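Either way, you can rule out configuration problems independently of Airflow by checking that the host/port from your [smtp] section is actually reachable with plain smtplib (a hypothetical check_smtp helper, not part of Airflow):

```python
import smtplib
import socket

def check_smtp(host, port, use_starttls=False, timeout=5):
    """Return True if an SMTP server answers at host:port.

    Mirror the [smtp] settings from airflow.cfg; pass
    use_starttls=True when smtp_starttls = True there.
    """
    try:
        server = smtplib.SMTP(host, port, timeout=timeout)
        if use_starttls:
            server.starttls()
        server.quit()
        return True
    except (smtplib.SMTPException, socket.error):
        return False
```

If this returns False for your configured host and port, no Airflow setting will make the failure emails go out.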

Ganesh