
I would like to connect to AWS S3 without using the Admin -> Connections UI in Airflow. Can anyone please let me know if there is a way to do this in Apache Airflow?


1 Answer


Since you want to connect to AWS S3 without creating a connection in the Airflow UI (and therefore without the built-in S3 operators, which expect one), you can use the PythonOperator and make sure boto3 is added to the Python dependencies of your Airflow installation.

To install it, run pip install boto3

If you are running Airflow in Docker, you have to install boto3 in the Dockerfile instead (see the sketch just below).
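A minimal sketch of what that could look like; the base image and its tag are assumptions, not something from the question:

# Illustrative Dockerfile snippet; the apache/airflow tag is an assumption
FROM apache/airflow:2.3.2
RUN pip install --no-cache-dir boto3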


import boto3

from airflow.models import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime




args = {
    'owner': 'airflow_default',
    'start_date': datetime(2022, 6, 6),
}

def s3_connect(ds, **kwargs):
    ''' Connect to S3 with boto3; the key values below are placeholders '''
    s3 = boto3.resource(
        service_name='s3',
        region_name='us-east-2',
        aws_access_key_id='mykey',
        aws_secret_access_key='mysecretkey')
    # Simple sanity check: list every bucket the credentials can see
    for bucket in s3.buckets.all():
        print(bucket.name)


def my_dummy_function():
    ''' Dummy function that does nothing useful '''
    return 'dummy function test'

dag = DAG(
    dag_id='example_python_operator_without_s3',
    default_args=args,
    schedule_interval=None)

# No provide_context needed: Airflow 2 passes the context
# (ds, **kwargs) to the callable automatically
run_this = PythonOperator(
    task_id='print_the_context',
    python_callable=s3_connect,
    dag=dag)

also_this = PythonOperator(
    task_id='dummy_function',
    python_callable=my_dummy_function,
    dag=dag)

run_this >> also_this
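One caveat with the snippet above: hardcoding keys in a DAG file is easy to leak. A minimal alternative sketch, assuming the standard AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variables are exported in the worker environment, is to read them with os.environ:

import os

import boto3

def s3_connect_from_env(ds, **kwargs):
    ''' Same bucket-listing check as above, but with the keys read
        from the environment instead of hardcoded in the DAG file '''
    s3 = boto3.resource(
        service_name='s3',
        region_name='us-east-2',
        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'])
    for bucket in s3.buckets.all():
        print(bucket.name)

AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are also the names boto3 itself looks for, so if they are set you can drop the two explicit keyword arguments entirely.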