Questions tagged [mwaa]

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed service for Apache Airflow that makes it easy for you to build and manage your workflows in the cloud.

With Amazon MWAA, you can easily combine data using any of Apache Airflow's open-source integrations. Teams use the same familiar Airflow platform they use today to manage their workflows, and gain improved scalability, availability, and security without the operational burden of managing the underlying infrastructure.

MWAA automatically scales capacity up to meet demand and back down to conserve resources and minimize costs. It is integrated with AWS security services to enable secure access, and it manages the provisioning and ongoing maintenance of Apache Airflow.

https://docs.aws.amazon.com/mwaa/

291 questions
0 votes, 1 answer

S3PrefixSensor in MWAA

I am using the S3PrefixSensor in MWAA v2.2.2, where I am waiting for some file to appear in an S3 bucket. Here is my relevant code: s3_success_flag_sensor = S3PrefixSensor( task_id='s3_success_flag', bucket_name=MY_BUCKET, prefix=MY_PREFIX, …
YuriR • 1,251 • 3 • 14 • 26
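For reference, S3PrefixSensor's poke amounts to checking whether any object key in the bucket starts with the given prefix. A minimal sketch of that check, with the S3 listing stubbed out as a plain list (the bucket contents and prefixes here are illustrative, not the asker's actual values):

```python
def prefix_exists(keys, prefix):
    """Return True if any object key starts with the prefix,
    mirroring what S3PrefixSensor polls for."""
    return any(key.startswith(prefix) for key in keys)

# Stubbed S3 listing standing in for a list_objects_v2 result.
listing = ["data/2022/01/_SUCCESS", "data/2022/01/part-00000.parquet"]
print(prefix_exists(listing, "data/2022/01/_SUCCESS"))  # True once the flag file has landed
print(prefix_exists(listing, "data/2022/02/"))          # False until files appear
```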
0 votes, 1 answer

MWAA-Generated CLI Token Returns a 403 After a Successful CLI Token Generation

Description Code from datetime import datetime import requests import boto3 TODAYS_DATE = datetime.now() def main(): mwaa_client = boto3.client('mwaa') response = mwaa_client.create_cli_token(Name="dev-main-mwaa-env") headers = { …
AlexLordThorsen • 8,057 • 5 • 48 • 103
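A 403 after a successful create_cli_token often comes down to the request that follows: the token is short-lived (on the order of a minute, per the MWAA docs), and the POST must go to the /aws_mwaa/cli path on the returned web server hostname with a Bearer authorization header. A sketch of assembling that request from the boto3 response, with the hostname and token values hypothetical:

```python
def build_mwaa_cli_request(token_response, command):
    """Assemble the POST request that exchanges an MWAA CLI token
    for a CLI invocation. token_response is the dict returned by
    boto3's mwaa.create_cli_token()."""
    host = token_response["WebServerHostname"]
    return {
        "url": f"https://{host}/aws_mwaa/cli",
        "headers": {
            "Authorization": f"Bearer {token_response['CliToken']}",
            "Content-Type": "text/plain",
        },
        "data": command,  # the raw Airflow CLI command, e.g. "dags list"
    }

# Hypothetical token response, shaped like create_cli_token's output.
req = build_mwaa_cli_request(
    {"CliToken": "abc123", "WebServerHostname": "example.airflow.amazonaws.com"},
    "dags list",
)
print(req["url"])  # https://example.airflow.amazonaws.com/aws_mwaa/cli
```

Generating a fresh token immediately before each request, rather than reusing one, avoids the expiry case.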
0 votes, 1 answer

401 From MWAA Airflow Environment When Attempting to Run a DAG

Description: When attempting to make calls to a managed Amazon Airflow instance, I was not able to make API calls to the Airflow environment despite being able to generate a CLI token from aws mwaa create-cli-token. Why am I getting a FORBIDDEN error…
AlexLordThorsen • 8,057 • 5 • 48 • 103
0 votes, 2 answers

How to pass the experiment configuration to a SageMakerTrainingOperator while training?

Idea: to use experiments and trials to log the training parameters and artifacts in SageMaker while using MWAA as the pipeline orchestrator. I am using training_config to create the dict that passes the training configuration to the TensorFlow…
Space4VV • 125 • 9
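Since SageMakerTrainingOperator ultimately forwards its config dict to the CreateTrainingJob API, one likely route is to merge an ExperimentConfig block into the dict that training_config produces before handing it to the operator. A sketch as a plain dict transformation (the job, experiment, and trial names are placeholders):

```python
def with_experiment_config(train_cfg, experiment_name, trial_name):
    """Attach a SageMaker ExperimentConfig block to an existing
    training-job config dict (as produced by training_config)
    before passing it to SageMakerTrainingOperator(config=...)."""
    cfg = dict(train_cfg)  # shallow copy so the original stays untouched
    cfg["ExperimentConfig"] = {
        "ExperimentName": experiment_name,
        "TrialName": trial_name,
    }
    return cfg

# Placeholder config standing in for training_config(...)'s output.
cfg = with_experiment_config({"TrainingJobName": "tf-train"}, "my-exp", "trial-1")
```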
0 votes, 1 answer

Airflow - Unable to Create Table in Snowflake

I am running Airflow v2.2.2 using the MWAA service on AWS. I have the following DAG: import airflow from airflow import DAG from airflow.operators.python_operator import PythonOperator import time from datetime import timedelta import…
Damien • 4,081 • 12 • 75 • 126
0 votes, 1 answer

Best way to organize code for feature development in Airflow

We are currently migrating to AWS MWAA, and I would like to enable feature development in our DAGs by multiple developers in one environment at the same time. The idea is to create 'workspaces' for new features…
0 votes, 1 answer

bash: spark-submit: command not found while executing DAG in AWS Managed Apache Airflow

I have to run a Spark job (I am new to Spark) and am getting the following error: [2022-02-16 14:47:45,415] {{bash.py:135}} INFO - Tmp dir root location: /tmp [2022-02-16 14:47:45,416] {{bash.py:158}} INFO - Running command: spark-submit --class…
JaySean • 125 • 1 • 15
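The MWAA workers do not ship with a Spark installation, so invoking spark-submit from a BashOperator fails regardless of the DAG. A common workaround is to run the job on an EMR cluster instead, e.g. via the Amazon provider's EmrAddStepsOperator. A sketch of the step definition as the plain dict structure those operators accept, with the class and jar paths as placeholders:

```python
# The spark-submit invocation expressed as an EMR step: the same
# structure is accepted by boto3's add_job_flow_steps and by
# Airflow's EmrAddStepsOperator(steps=[...]). Class and jar names
# below are placeholders.
SPARK_STEP = {
    "Name": "run-spark-job",
    "ActionOnFailure": "CONTINUE",
    "HadoopJarStep": {
        "Jar": "command-runner.jar",
        "Args": [
            "spark-submit",
            "--class", "com.example.MainClass",  # placeholder class
            "s3://my-bucket/jars/job.jar",       # placeholder jar
        ],
    },
}
```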
0 votes, 1 answer

How to use a wildcard character to search for an S3 file using S3KeySensor in Airflow

Hello, I am using S3KeySensor to look for parquet files created in a specific partition. As the file names are generated by Spark (e.g. part-00499-e91c1af8-4352-4de9*), what should the bucket_key be? The code below is failing. …
user3858193 • 1,320 • 5 • 18 • 50
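When S3KeySensor is constructed with wildcard_match=True, it matches bucket_key against object keys using Unix-shell globbing (Python's fnmatch), so a Spark part file can be targeted with a pattern rather than an exact key. The matching behavior can be checked locally; the partition prefix below is a placeholder:

```python
from fnmatch import fnmatch

# With wildcard_match=True, S3KeySensor compares keys against
# bucket_key using fnmatch-style globbing, as demonstrated here.
pattern = "my-partition/part-*.parquet"  # placeholder prefix
key = "my-partition/part-00499-e91c1af8-4352-4de9-abc.snappy.parquet"
print(fnmatch(key, pattern))  # True
```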
0 votes, 2 answers

Amazon MWAA and S3 connection

I'm migrating from on-premises Airflow to Amazon MWAA 2.0.2. Currently I'm using an S3 connection which contains the access key ID and secret key for S3 operations: { "conn_id" = "s3_default" "conn_type" : "S3" "extra" = { …
Nisman • 1,271 • 2 • 26 • 56
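One common answer to this migration question is to stop storing access keys altogether: on MWAA, S3 access is normally authorized by the environment's execution role, so an aws-type connection can be defined with no credentials at all. A sketch of building such a connection in Airflow 2's URI form (where query parameters become the connection's extras); the region value is illustrative:

```python
from urllib.parse import quote

def aws_conn_uri(extras):
    """Build an Airflow 2 URI for a credential-free "aws"-type
    connection; on MWAA the execution role supplies the actual
    permissions. Query parameters become connection extras."""
    query = "&".join(f"{k}={quote(str(v))}" for k, v in extras.items())
    return f"aws://?{query}" if query else "aws://"

uri = aws_conn_uri({"region_name": "us-east-1"})
print(uri)  # aws://?region_name=us-east-1
```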
0 votes, 1 answer

Orchestration of Redshift Stored Procedures using AWS Managed Airflow

I have many Redshift stored procedures (15-20); some can run asynchronously while many have to run in a synchronous manner. I tried scheduling them in async and sync fashion using AWS EventBridge but found many limitations (failure handling and…
Aim_headshot • 111 • 1 • 6
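In Airflow terms, the synchronous procedures become a chained sequence of tasks (t1 >> t2) and the asynchronous ones fan out in parallel, with each task running a CALL statement against Redshift (e.g. via a Postgres- or Redshift-flavored operator). A sketch of expressing that split as plain data; the procedure names are placeholders:

```python
# Placeholder procedure names: the sync list implies a chained
# task sequence, the async list a parallel fan-out.
SYNC_PROCS = ["sp_load_stage", "sp_merge_dim"]   # must run in order
ASYNC_PROCS = ["sp_refresh_a", "sp_refresh_b"]   # can run in parallel

def call_stmt(proc):
    """SQL each task would execute against Redshift."""
    return f"CALL {proc}();"

print([call_stmt(p) for p in SYNC_PROCS])
```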
0 votes, 1 answer

Upserting and maintaining a Postgres table using Apache Airflow

I am working on an ETL process that requires me to pull data from one Postgres table and update data in another Postgres table in a separate environment (same column names). Currently, I am running the Python job in a Windows EC2 instance, and I am…
Saad • 11 • 3
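Since both ends are Postgres, the usual pattern is to read with one hook, then write with an INSERT ... ON CONFLICT DO UPDATE statement instead of a manual select-then-update loop. A sketch of constructing that SQL; the table and column names are hypothetical:

```python
def upsert_sql(table, columns, conflict_cols):
    """Build a Postgres INSERT ... ON CONFLICT DO UPDATE statement.
    Non-key columns are overwritten from the incoming row via
    EXCLUDED. Table/column names here are placeholders."""
    col_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    updates = ", ".join(
        f"{c} = EXCLUDED.{c}" for c in columns if c not in conflict_cols
    )
    return (
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
        f"ON CONFLICT ({', '.join(conflict_cols)}) DO UPDATE SET {updates}"
    )

sql = upsert_sql("target_tbl", ["id", "name", "updated_at"], ["id"])
print(sql)
```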
0 votes, 1 answer

Table transfer stops after first iteration in Airflow

I use the following code to transfer a small table from database A to database B with Airflow (MWAA): def move_data(sql_file_name, target_tbl_name, target_schema_name): select_stmt = "" dir_path = os.path.dirname(os.path.realpath(__file__)) …
BoIde • 306 • 1 • 3 • 16
0 votes, 1 answer

Broken DAG with MWAA (AWS Managed Airflow) - psycopg2

With Managed Airflow 2.0.2, my requirements.txt looks like: apache-airflow[postgres] psycopg2-binary botocore When I try importing psycopg2, I get a broken DAG failing to recognize psycopg2. I already…
Sam • 1
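A likely cause on MWAA 2.0.2 is unpinned packages pulling in versions that conflict with the environment's base image; the MWAA docs recommend pinning requirements against the constraints file matching your Airflow version. A sketch of a requirements.txt along those lines (versions assumed, so verify them against the constraints file for your environment):

```
apache-airflow[postgres]==2.0.2
psycopg2-binary
```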
0 votes, 0 answers

How can I run dbt jobs with AWS Managed Airflow?

I wanted to know how I can run my dbt jobs with AWS Managed Airflow. I have created an environment in AWS for testing, following this LINK, but the jobs are failing. Also, I have uploaded requirements.txt in S3 along with plugins.zip. Content of…
Akash • 31 • 1 • 6
0 votes, 1 answer

Managed Workflows with Apache Airflow (MWAA) - how to disable a task run's dependency on the previous run

I have an Apache Airflow managed environment running in which a number of DAGs are defined and enabled. Some DAGs are scheduled, running on a 15-minute schedule, while others are not scheduled. All the DAGs are single-task DAGs. The DAGs are…
srm • 548 • 1 • 4 • 19
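If the intent is that each 15-minute run should proceed regardless of how the previous run ended, the relevant knobs are depends_on_past and wait_for_downstream on the tasks, plus catchup on the DAG. A sketch of the settings as plain dicts (in a real DAG, default_args goes to the DAG or operator constructors):

```python
# Per-task settings that make a run independent of the prior run.
default_args = {
    "depends_on_past": False,      # don't wait for the prior run's task to succeed
    "wait_for_downstream": False,  # don't wait for the prior run's downstream tasks
}

# DAG-level setting: skip backfilling missed 15-minute intervals.
dag_kwargs = {
    "catchup": False,
    "default_args": default_args,
}
```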