4

AWS MWAA (Managed Workflows for Apache Airflow) is a relatively new service provided by AWS. When configuring an MWAA environment, it is possible to provide a custom requirements.txt file, which is used to install additional Python packages in that environment.

At the company I work for, we use AWS CodeArtifact as a custom PyPI package repository, where we upload private Python packages. We want to use some of them in Airflow DAGs. That's why I was wondering whether an MWAA environment can be configured to use the PyPI repository from CodeArtifact.

Or is there any other way to install custom Python packages (not on public PyPI) in an MWAA environment?

  • Could you take a look at this: https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-dependencies.html? It has a section on adding a hosted URL, which could be your CodeArtifact repository URL. – Eman Mar 23 '21 at 04:39
  • Thank you. However, it's not working, since the only supported authentication method is Basic auth (username, password), e.g.: `--index-url=https://${AIRFLOW__FOO_USER}:${AIRFLOW__FOO_PASS}@my.privatepypi.com`. CodeArtifact uses a token as the password, and it has to be renewed every 12 hours. – Peter Jakubčo Mar 23 '21 at 14:12

2 Answers

1

I haven't tried this, but it should work:

# aws codeartifact login --tool pip --domain <domain> --repository <repository>
# awk '/index-url/ {print "-i "$3}' ~/.config/pip/pip.conf  > requirements.txt
# echo <my python package> >> requirements.txt

The result can then be used as a normal requirements file:

# pip3 install -r requirements.txt 
Looking in indexes: https://aws:****

Please note that a CodeArtifact token expires after at most 12 hours. You can create a recurring job to regenerate this file.
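As a rough idea of what such a recurring job could look like, here is a minimal Python sketch using boto3. It is untested against a real MWAA environment. The function names and all parameters (domain, owner, repository, bucket, key, packages) are assumptions for illustration. It also assumes the job's role is allowed to call CodeArtifact and to write to the bucket MWAA reads its requirements.txt from.

```python
from typing import Iterable


def build_requirements(endpoint: str, token: str, packages: Iterable[str]) -> str:
    """Return requirements.txt content that points pip at a CodeArtifact index.

    `endpoint` is the PyPI repository endpoint reported by CodeArtifact,
    `token` is a CodeArtifact authorization token.
    """
    scheme, rest = endpoint.split("://", 1)
    # CodeArtifact expects the fixed user name "aws" with the token as password.
    index_url = f"{scheme}://aws:{token}@{rest.rstrip('/')}/simple/"
    lines = [f"--index-url {index_url}", *packages]
    return "\n".join(lines) + "\n"


def refresh_requirements(domain: str, owner: str, repository: str,
                         bucket: str, key: str, packages: Iterable[str]) -> None:
    """Fetch a fresh token and upload a regenerated requirements file to S3.

    All arguments are hypothetical placeholders for your own setup.
    """
    import boto3  # imported here so build_requirements has no AWS dependency

    codeartifact = boto3.client("codeartifact")
    token = codeartifact.get_authorization_token(
        domain=domain,
        domainOwner=owner,
        durationSeconds=43200,  # 12 hours, the maximum token lifetime
    )["authorizationToken"]
    endpoint = codeartifact.get_repository_endpoint(
        domain=domain,
        domainOwner=owner,
        repository=repository,
        format="pypi",
    )["repositoryEndpoint"]

    body = build_requirements(endpoint, token, packages)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
```

The job would have to run more often than every 12 hours so the token in the file never goes stale. Whether the environment picks up the regenerated file without an environment update is something you would need to verify for your setup.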

1

Although this is a bit late, AWS does provide a sample repo that fulfills your requirements. The example uses the AWS CDK (Python) to set up the VPC endpoint to CodeArtifact and your private VPC. The Lambda runtime in the sample is Python 3.7 (already deprecated), but you can update it to Python 3.10 (the latest at the time of writing) and update the AWS CDK Python version without any issues. I recently used this repo myself, and my private MWAA environment now uses CodeArtifact without issue. You may also change the S3 bucket and path according to your requirements.

https://github.com/aws-samples/amazon-mwaa-examples/blob/main/usecases/mwaa-with-codeartifact/README.md

To install the aws-cdk Node.js package:

yarn global add aws-cdk

npx cdk deploy CodeArtifactStack LambdaCronStack # requires an LTS version of Node.js
whyuenacccc