I am trying to build a Lambda function that runs in AWS and copies a file from S3 to Azure using the "azcopy copy" command (v10). To do that, I've created a Docker image with azcopy installed in it. I am required to go through an external proxy (something like http://proxy-something.azure.aztec.cloud.something:80) so that the other party can restrict access to my IP address only.
My Dockerfile is something like this:
FROM public.ecr.aws/lambda/python:3.9
RUN yum -y update && \
yum -y install \
gzip \
wget \
tar \
&& yum clean all
RUN wget https://aka.ms/downloadazcopy-v10-linux && \
tar -vxf downloadazcopy-v10-linux && \
cd azcopy_linux_amd64_10.14.1 && \
mv azcopy .. && rm NOTICE.txt && cd .. && \
rmdir azcopy_linux_amd64_10.14.1/ && \
rm downloadazcopy-v10-linux
COPY requirements.txt ./
RUN python3 -m pip install --trusted-host pypi.org --upgrade pip \
&& python3 -m pip install --trusted-host pypi.org --upgrade wheel \
&& python3 -m pip install --trusted-host pypi.org --upgrade setuptools \
&& python3 -m pip install boto3 \
&& python3 -m pip install awslambdaric \
&& python3 -m pip install -r ./requirements.txt
COPY app.py ./
RUN chmod 644 $(find . -type f) && \
chmod 755 $(find . -type d) && \
chmod +x azcopy
RUN yum -y install net-tools
ENV AWS_ACCESS_KEY_ID <mykey>
ENV AWS_SECRET_ACCESS_KEY <mysecret>
ENV AWS_REGION sa-east-1
ENV DEBUG TRUE
ENV LINK "https://<my_azcopy_link>"
ENV PROXY http://proxy-something.azure.aztec.cloud.something:80
ENV HTTP_PROXY http://proxy-something.azure.aztec.cloud.something:80
ENV HTTPS_PROXY http://proxy-something.azure.aztec.cloud.something:80
EXPOSE 8080
EXPOSE 80
EXPOSE 443
CMD [ "app.handler" ]
My Python code just loads the file from my S3 bucket and calls azcopy like this:
AZCOPY_LOG_LOCATION=/tmp AZCOPY_CONCURRENCY_VALUE=2 AZCOPY_CONCURRENT_FILES=1 /var/task/azcopy copy . 'https://<mylink>' --recursive --cap-mbps 20
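For reference, the call in app.py is roughly equivalent to the sketch below. The helper names (`build_azcopy_command`, `build_azcopy_env`, `run_azcopy`), the use of `subprocess.run`, and the 240-second timeout are illustrative assumptions rather than my exact code; the binary path, environment variables, and flags are the ones from the command above.

```python
import os
import subprocess

AZCOPY_PATH = "/var/task/azcopy"  # location of the binary inside the image


def build_azcopy_command(source, destination, cap_mbps=20):
    """Return the azcopy argument list (hypothetical helper)."""
    return [
        AZCOPY_PATH, "copy", source, destination,
        "--recursive",
        "--cap-mbps", str(cap_mbps),
    ]


def build_azcopy_env(base_env=None):
    """Copy the environment (including any proxy variables) and add the azcopy settings."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "AZCOPY_LOG_LOCATION": "/tmp",
        "AZCOPY_CONCURRENCY_VALUE": "2",
        "AZCOPY_CONCURRENT_FILES": "1",
    })
    return env


def run_azcopy(source, destination, timeout=240):
    """Run azcopy without a shell; raises subprocess.TimeoutExpired if it hangs."""
    return subprocess.run(
        build_azcopy_command(source, destination),
        env=build_azcopy_env(),
        capture_output=True,
        text=True,
        timeout=timeout,  # assumed value, kept below the 5-minute Lambda timeout
    )
```

Because the arguments are passed as a list and no shell is involved, the destination URL is given without the surrounding single quotes that the shell form needs.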
Outside my Docker image, running this command directly seems to work. However, inside Docker it gets stuck without ever transferring anything (my timeout is 5 minutes):
START RequestId: 2bb96291-240b-4d35-a2bd-0400df4366e0 Version: $LATEST
INFO: Scanning...
INFO: Proxy detected: https://azcopyvnextrelease.blob.core.windows.net -> http://proxy-something.azure.aztec.cloud.something:80
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support
Job b44c7ab0-c0cf-0d44-7466-7a4283a7a5bc has started
Log file is located at: /tmp/b44c7ab0-c0cf-0d44-7466-7a4283a7a5bc.log
INFO: Proxy detected: https://external_addr.blob.core.windows.net -> http://proxy-something.azure.aztec.cloud.something:80
0.0 %, 0 Done, 0 Failed, 2 Pending, 0 Skipped, 2 Total,
09 May 2022 13:30:31,183 [WARNING] (rapid) Reset initiated: Timeout
END RequestId: 2bb96291-240b-4d35-a2bd-0400df4366e0
REPORT RequestId: 2bb96291-240b-4d35-a2bd-0400df4366e0 Duration: 300000.00 ms Billed Duration: 300000 ms Memory Size: 3008 MB Max Memory Used: 3008 MB
I can't get this to work even on my local machine. If I run the Lambda in the cloud, the same thing happens.
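To narrow down whether the container can reach the proxy at all, a check along these lines could be run inside the image. This is only a sketch: `proxy_endpoint` and `can_connect` are hypothetical helper names, the 5-second timeout is an arbitrary choice, and a successful TCP connection does not prove that the proxy will forward traffic to the blob endpoint.

```python
import socket
from urllib.parse import urlsplit


def proxy_endpoint(proxy_url):
    """Split a proxy URL into (host, port); default to 443 for https, else 80."""
    parts = urlsplit(proxy_url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False
```

From the handler this could be called as `can_connect(*proxy_endpoint(os.environ["HTTPS_PROXY"]))` (with `import os`) and the result logged before azcopy starts.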
What am I doing wrong? What could I do?