I need to download large files and write them to S3. Instead of downloading the files to the local hard drive and then copying them to S3, is it possible to stream the files directly to S3?

I found the following code at https://www.python-httpx.org/advanced/. How can I write each chunk to S3 instead of a temporary file?

import tempfile

import httpx
from tqdm import tqdm

with tempfile.NamedTemporaryFile() as download_file:
    url = "https://speed.hetzner.de/100MB.bin"
    with httpx.stream("GET", url) as response:
        total = int(response.headers["Content-Length"])

        with tqdm(total=total, unit_scale=True, unit_divisor=1024, unit="B") as progress:
            num_bytes_downloaded = response.num_bytes_downloaded
            for chunk in response.iter_bytes():
                download_file.write(chunk)
                progress.update(response.num_bytes_downloaded - num_bytes_downloaded)
                num_bytes_downloaded = response.num_bytes_downloaded
ca9163d9
  • See https://stackoverflow.com/questions/42293270/list-parts-in-a-multipart-upload-using-boto3-with-python for a way to do it with MultiPartUpload – kgiannakakis Nov 18 '21 at 08:00
  • Lambda can run up to 15 minutes, and you just need a pre-signed URL to upload to the bucket. https://docs.aws.amazon.com/code-samples/latest/catalog/python-s3-s3_basics-presigned_url.py.html – Sharuzzaman Ahmat Raslan Nov 18 '21 at 09:02
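
Following the first comment's suggestion of a multipart upload, here is a minimal sketch of one way to do it. Rather than calling the multipart API by hand, it wraps the httpx chunk iterator in a file-like object and hands it to boto3's `upload_fileobj`, which performs a managed (multipart) upload and reads the stream sequentially, so nothing is written to the local disk. The URL, bucket, and key names are placeholders.

```python
import io


class IterStream(io.RawIOBase):
    """Adapt an iterator of byte chunks into a read-only binary stream."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, buf):
        # Refill from the iterator until we have data or hit EOF.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # EOF
        n = min(len(buf), len(self._leftover))
        buf[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def stream_to_s3(url, bucket, key):
    # boto3 and httpx are imported lazily so the adapter above can be
    # used without AWS credentials or network access.
    import boto3
    import httpx

    s3 = boto3.client("s3")
    with httpx.stream("GET", url) as response:
        response.raise_for_status()
        body = io.BufferedReader(
            IterStream(response.iter_bytes()), buffer_size=1024 * 1024
        )
        # upload_fileobj runs a managed transfer: for large bodies it
        # splits the stream into parts and uses the multipart API.
        s3.upload_fileobj(body, bucket, key)


# stream_to_s3("https://speed.hetzner.de/100MB.bin", "my-bucket", "100MB.bin")
```

The tqdm progress bar from the question can be kept by calling `progress.update(len(chunk))` inside a generator that yields each chunk before it is passed to `IterStream`.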

0 Answers