I've used the argument files=["abc.txt"] with the EmailOperator. I got this from the Airflow docs: https://airflow.readthedocs.io/en/stable/_modules/airflow/operators/email_operator.html

But I'm getting an error that the file is not found. My question is: where does Airflow pick up the file from? Is it the GCS bucket or the DAG folder in the Composer environment?

Where do I need to upload the file, and what is the correct syntax for the 'files' argument?

Thanks in advance.

Tlaquetzal

1 Answer

The files are read from the local file system of the worker that runs the task, and the files argument is indeed a list of path strings, as in files=["abc.txt"].

To make a file available in Composer, upload it to the data folder inside your Composer environment's GCS bucket. The workers can then read it from /home/airflow/gcs/data/.

An example taken from the documentation, with the files argument added:

    from airflow import models
    from airflow.operators import email_operator

    # Send the confirmation email with a file from the Composer data folder attached.
    email_summary = email_operator.EmailOperator(
        task_id='email_summary',
        to=models.Variable.get('email'),
        subject='Sample Email',
        html_content="HTML content",
        files=['/home/airflow/gcs/data/abc.txt'])
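If it helps, the mapping from an object uploaded under data/ to the path the worker sees can be sketched as a small helper. This is only an illustration: `data_path` and `missing_files` are hypothetical names, not part of Airflow, and the base directory is the one mentioned above.

```python
import os
import posixpath

# Local mount of the environment bucket's data/ folder on Composer workers.
COMPOSER_DATA_DIR = "/home/airflow/gcs/data"


def data_path(relative_name, base_dir=COMPOSER_DATA_DIR):
    """Return the worker-local path for an object uploaded under data/."""
    # posixpath keeps forward slashes regardless of where this is run.
    return posixpath.join(base_dir, relative_name)


def missing_files(paths):
    """Return the subset of paths that do not exist on this machine."""
    return [p for p in paths if not os.path.isfile(p)]
```

You could then pass `files=[data_path('abc.txt')]` to the operator, and call `missing_files(...)` inside a task to get a clearer error before the email is built.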

You can also download a file to the local file system and then send it; however, you'll need to ensure that the download task and the email task are executed by the same worker.
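One way to sidestep the same-worker requirement is to do the download and the send inside a single task. Below is a minimal sketch of that idea, not an Airflow API: `download_then_email`, `fetch` and `send` are hypothetical names, and the two callables are supplied by you.

```python
import os
import tempfile


def download_then_email(fetch, send, object_name, to, subject, html_content):
    """Download one object and email it as an attachment in the same process.

    fetch(object_name, local_path) must write the object to local_path.
    send(to, subject, html_content, files=[...]) must send the message.
    Both are caller-supplied hooks (assumptions of this sketch).
    """
    attachment_name = os.path.basename(object_name)
    # The temporary directory is removed on exit, so large attachments
    # do not accumulate on the worker.
    with tempfile.TemporaryDirectory() as workdir:
        local_path = os.path.join(workdir, attachment_name)
        fetch(object_name, local_path)
        if not os.path.isfile(local_path):
            raise FileNotFoundError(local_path)
        send(to, subject, html_content, files=[local_path])
    return attachment_name
```

Called from a PythonOperator, `send` could be `airflow.utils.email.send_email`, which accepts these arguments, and `fetch` could wrap whatever GCS download call you already use.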

Tlaquetzal
  • How can I attach a file that is in another bucket, not the Airflow Composer bucket? – Chakkirala Chaitanya Jan 16 '20 at 05:04
  • You need to download the file to the local system first. You can use the [GoogleCloudStorageDownloadOperator](https://github.com/apache/airflow/blob/1.10.3/airflow/contrib/operators/gcs_download_operator.py) to download the file, and then you can send it. You only have to make sure that both tasks are [executed by the same worker](https://stackoverflow.com/q/45842564/7517757) so the EmailOperator can find the file on the local system. – Tlaquetzal Jan 16 '20 at 16:14
  • This takes more time if the file is huge. Is there any solution that works directly with the GCS bucket instead of downloading to the local system? As per my requirement, I shouldn't involve the local system at all. – Chakkirala Chaitanya Jan 17 '20 at 11:08
  • You cannot attach a file to an email with only a Google Cloud Storage URI. If you don't want to download the file, the alternative is to send a URL from which the recipient can download the file directly from GCS, instead of attaching the file itself. You can use [Signed URLs](https://cloud.google.com/storage/docs/access-control/signed-urls) for this. Also, consider that a really big file might hit the email attachment size limits, so sending URLs seems to be a good approach. – Tlaquetzal Jan 17 '20 at 17:15
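
For the signed URL route in the last comment, a rough sketch could look like the following. The helper names `make_signed_url` and `link_email_body` are hypothetical. The first function assumes the google-cloud-storage client library is installed and that the credentials in use are able to sign URLs, which is not guaranteed in every environment. The second only builds the HTML to pass as html_content.

```python
import datetime
import html


def make_signed_url(bucket_name, object_name, hours=24):
    """Return a time-limited download URL for one GCS object (assumes signing-capable credentials)."""
    # Imported lazily so the rest of this sketch works without the library.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(object_name)
    return blob.generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(hours=hours),
        method="GET",
    )


def link_email_body(url, label):
    """Build an HTML body that links to the file instead of attaching it."""
    return '<p>Download: <a href="{}">{}</a></p>'.format(
        html.escape(url, quote=True), html.escape(label)
    )
```

The result of `link_email_body(make_signed_url(...), 'abc.txt')` would go into the EmailOperator's html_content, with the files argument left out.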