
I'm looking to create a DAG that runs two tasks. The ideal workflow would be to create a PEM key file and then SSH using that newly created file.

The first task would use the PythonOperator to call AWS Secrets Manager through the SecretsManagerHook, grab a PEM key stored there, and write it to a temporary file.

The second task would use the SSHOperator and reference that PEM file in its ssh_hook.

I am able to get the temporary file written to a location, but I can't access it in the next task with the SSHOperator.

# USED TO GET PEM KEY FROM AWS SECRETS MANAGER

import tempfile

from airflow.providers.amazon.aws.hooks.secrets_manager import SecretsManagerHook


def retrieve_pem_key_from_aws_secrets_manager(**kwargs):
    # Fetch the PEM key stored in AWS Secrets Manager
    secrets_manager_hook = SecretsManagerHook()
    sm_client = secrets_manager_hook.get_conn()
    secret = sm_client.get_secret_value(SecretId='<SECRET>')
    pem_key_value = secret["SecretString"]

    # Write the key to a temporary file and return the path to it
    pem_file = tempfile.NamedTemporaryFile(mode='w+', encoding='UTF-8')
    pem_file.write(pem_key_value)
    pem_file.seek(0)
    return pem_file.name

PythonOperator task:

task1 = PythonOperator(
    task_id='create_pem_key',
    python_callable=retrieve_pem_key_from_aws_secrets_manager,
)

This outputs a path such as /tmp/<FILE_NAME>.

My SSHOperator task:

task2 = SSHOperator(
    task_id='test_ssh_connectivity',
    ssh_conn_id=None,
    ssh_hook=SSHHook(
        ssh_conn_id=None,
        remote_host=<HOST>,
        username="ec2-user",
        key_file=retrieve_pem_key_from_aws_secrets_manager(),
    ),
    command='echo Hello',
)

The file referenced in the error below is different from the file generated in task1.

airflow.exceptions.AirflowException: SSH operator error: [Errno 2] No such file or directory: '/tmp/tmpt6gh71_b'

It seems that both tasks run on the same host in AWS, but my understanding of MWAA is that it uses Fargate containers for task execution, so these could be running in different containers. Is it possible to run both tasks together?

I also have the option of using the Secrets Manager backend, but I have not explored that yet and am not sure if it would help in this scenario.

Is there an ideal way to write these tasks so that I can just use Python code to do what I need with minimal configuration of my Airflow environment?

rk92
  • A similar question is asked here: https://stackoverflow.com/questions/71530558/airflow-sshoperator-how-to-securely-access-pem-file-across-tasks – Andrew Nguonly Jun 21 '22 at 14:25

1 Answer


Since you already store the key in Secrets Manager and are using the SSHOperator, the easiest approach I can think of is to configure a secrets backend in Airflow and use an Airflow Connection to feed the PEM key into the operator.

It might look like a lot of setup initially, but it will save you a lot of complexity downstream.

You can follow the documentation on how to set up the secrets backend:

https://docs.aws.amazon.com/mwaa/latest/userguide/connections-secrets-manager.html
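To illustrate, here is a rough sketch of what that could look like. The configuration keys are standard Airflow/MWAA settings, but the secret name, connection ID (ssh_ec2), and connection contents below are placeholders you would adapt to your setup; the exact connection format (URI vs. JSON) also depends on your Airflow version.

# A minimal sketch, assuming Secrets Manager is configured as the Airflow
# secrets backend via the MWAA configuration options:
#
#   secrets.backend        = airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend
#   secrets.backend_kwargs = {"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables"}
#
# and assuming a secret named "airflow/connections/ssh_ec2" (example name)
# holds an SSH connection whose extra field carries the PEM contents,
# e.g. {"private_key": "-----BEGIN RSA PRIVATE KEY-----\n..."}.
from datetime import datetime

from airflow import DAG
from airflow.providers.ssh.operators.ssh import SSHOperator

with DAG(
    dag_id="ssh_via_secrets_backend",
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # The hook behind ssh_conn_id="ssh_ec2" pulls the host, username, and
    # private key from the connection resolved by the secrets backend, so no
    # temporary files or cross-task file sharing are needed.
    test_ssh_connectivity = SSHOperator(
        task_id="test_ssh_connectivity",
        ssh_conn_id="ssh_ec2",
        command="echo Hello",
    )

Because the key is read from the connection at task run time, it does not matter which Fargate container each task lands on.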

Bhavani Ravi