We've launched a private MWAA environment. We can access the UI, but we're having trouble installing our Python requirements. MWAA picks up the requirements file from S3, but times out when it tries to install the Python packages.

This is expected, because we're behind a proxy, so my question would be: how do we tell MWAA to use our proxy while installing our python dependencies?

This is what our CloudWatch logstream (requirements_install_ip*) tells us:

WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) 
after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection
object at 0x7fda26b394d0>, 'Connection to pypi.org timed out. (connect timeout=15)')'
dovregubben
  • Did you try to add `--proxy http://user:password@proxyserver:port` as the first line of your `requirements.txt` file? I believe pip supports that format to pass the extra options. – jphuynh Sep 06 '21 at 19:51
  • I tried that with pip 21.2.4, but the `--proxy` flag is not allowed in the requirements file and throws an `ERROR: Invalid requirement: --proxy http://SOME_HOST:SOME_PORT` – dovregubben Sep 09 '21 at 06:51
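
The error in the comment above is consistent with pip accepting only a limited set of options inside a requirements file. As a rough illustration, the sketch below scans a requirements file for option lines that are outside that set. The `SUPPORTED_OPTIONS` list is an assumption taken from my reading of the pip requirements file format documentation and may differ between pip versions; `unsupported_options` is a hypothetical helper, not part of pip or MWAA.

```python
from typing import List

# Options that pip documents as valid inside a requirements file.
# ASSUMPTION: this list reflects the pip docs as I understand them; verify it
# against the documentation for your pip version. --proxy is not among them.
SUPPORTED_OPTIONS = {
    "-i", "--index-url", "--extra-index-url", "--no-index",
    "-c", "--constraint", "-r", "--requirement", "-e", "--editable",
    "-f", "--find-links", "--no-binary", "--only-binary", "--prefer-binary",
    "--require-hashes", "--pre", "--trusted-host", "--use-feature",
}


def unsupported_options(requirements_text: str) -> List[str]:
    """Return option names in the text that are not in SUPPORTED_OPTIONS.

    Limitation: short options written without a space (e.g. "-rbase.txt")
    are not split and would be reported as unsupported.
    """
    found = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        if not line.startswith("-"):
            continue  # plain requirement, comment, or blank line
        # Take the option name from "--opt value" or "--opt=value".
        option = line.split("=", 1)[0].split()[0]
        if option not in SUPPORTED_OPTIONS:
            found.append(option)
    return found
```

Running it on a file whose first line is `--proxy http://SOME_HOST:SOME_PORT` would report `--proxy`, which matches the behaviour described in the comment.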

2 Answers

We contacted AWS support about this, and apparently there is currently no option to pass a proxy setting to MWAA. So we filed a feature request.

I'm not sure whether this will be implemented at all, but anyone interested can subscribe to the MWAA document history feed.

dovregubben

You can set this in your pip.ini

[global]
index = https://eg.nexus.repo.url
index-url = https://eg.nexus.repo.url

To find out where your pip.ini is located, run:

pip config -v list
paradox
  • Sorry, I can't follow. AFAIK, we are not able to edit any files on the MWAA instances since it's a managed AWS service. Let me know if I'm wrong. Even if we were able to edit the pip.ini, it should be something like this, I guess: `[global] proxy = http://user:password@proxy_name:port` – dovregubben Aug 27 '21 at 10:40
  • OK, then maybe I didn't understand your question. If MWAA is installing your dependencies in its own environment, what does your proxy have to do with it? – paradox Sep 05 '21 at 06:43
  • We can provide MWAA with a requirements file for pip, and MWAA will try to install the required packages upon initialization. In our case it fails to do so because our traffic has to go through a proxy and MWAA isn't able to reach pypi.org (as shown by the logstream). – dovregubben Sep 06 '21 at 08:43
  • Where is MWAA installed? I mean, is it on Amazon's infrastructure or are you routing it through your proxy somehow? If the latter, then check what you allow in your proxy or give me some more info on how routing is done on your side. – paradox Sep 07 '21 at 09:36
  • Also, have you read this: https://docs.aws.amazon.com/mwaa/latest/userguide/working-dags-dependencies.html#working-dags-dependencies-prereqs – paradox Sep 07 '21 at 09:40
  • MWAA is running on AWS infrastructure. Our VPC has no internet gateway, so all internet traffic, such as requests to `pypi.org`, has to go through a proxy server in another AWS account. I guess MWAA is simply missing a parameter that would pass the proxy as an argument to `pip install --proxy ...`. Our workaround for the time being is to build wheels for all required packages, add them to the `plugins.zip` that is uploaded along with the `requirements.txt`, and reference those local wheel files in the requirements file. – dovregubben Sep 09 '21 at 07:05
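
The wheel-based workaround from the last comment could be scripted roughly as follows. This is a minimal sketch, not the poster's actual tooling: it assumes the wheels have already been built on a machine that can reach PyPI through the proxy (for example with `pip wheel -r requirements.txt -w wheels/`), and that MWAA extracts `plugins.zip` to `/usr/local/airflow/plugins`, which is the path the AWS documentation gives but is worth verifying for your environment. The helper names `package_wheels` and `build_requirements` are hypothetical.

```python
import zipfile
from pathlib import Path
from typing import List

# ASSUMPTION: directory where MWAA unpacks plugins.zip, per the AWS docs.
MWAA_PLUGINS_DIR = "/usr/local/airflow/plugins"


def package_wheels(wheel_dir: str, zip_path: str) -> List[str]:
    """Zip every .whl file in wheel_dir at the archive root.

    Returns the wheel file names in sorted order.
    """
    wheels = sorted(Path(wheel_dir).glob("*.whl"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for wheel in wheels:
            # Store each wheel flat, without its local directory prefix.
            archive.write(wheel, arcname=wheel.name)
    return [wheel.name for wheel in wheels]


def build_requirements(wheel_names: List[str]) -> str:
    """Build requirements.txt content that installs only the bundled wheels."""
    lines = [
        "--find-links " + MWAA_PLUGINS_DIR,  # resolve dependencies locally
        "--no-index",                        # never contact pypi.org
    ]
    lines += [MWAA_PLUGINS_DIR + "/" + name for name in wheel_names]
    return "\n".join(lines) + "\n"
```

A typical run would call `package_wheels("wheels", "plugins.zip")`, write the result of `build_requirements(...)` to `requirements.txt`, and upload both files to the environment's S3 bucket. The wheels need to match the Python version and platform of the MWAA workers, so building them on a comparable Linux image is the safer choice.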