I need to download large files and write them to S3. Instead of downloading the files to the local hard drive and then copying them to S3, is it possible to stream the files directly to S3?

I found the following code at https://www.python-httpx.org/advanced/. How can I write each chunk to S3 instead of a temporary file?

import tempfile

import httpx
from tqdm import tqdm

with tempfile.NamedTemporaryFile() as download_file:
    url = "https://speed.hetzner.de/100MB.bin"
    with httpx.stream("GET", url) as response:
        total = int(response.headers["Content-Length"])

        with tqdm(total=total, unit_scale=True, unit_divisor=1024, unit="B") as progress:
            num_bytes_downloaded = response.num_bytes_downloaded
            for chunk in response.iter_bytes():
                download_file.write(chunk)
                progress.update(response.num_bytes_downloaded - num_bytes_downloaded)
                num_bytes_downloaded = response.num_bytes_downloaded
ca9163d9
  • See https://stackoverflow.com/questions/42293270/list-parts-in-a-multipart-upload-using-boto3-with-python for a way to do it with MultiPartUpload – kgiannakakis Nov 18 '21 at 08:00
  • Lambda can run up to 15 minutes, and you just need a pre-signed URL to upload to the bucket. https://docs.aws.amazon.com/code-samples/latest/catalog/python-s3-s3_basics-presigned_url.py.html – Sharuzzaman Ahmat Raslan Nov 18 '21 at 09:02
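
Following the first comment's suggestion of a multipart upload, here is a minimal sketch of one way to do it. Rather than calling the multipart API by hand, it wraps the httpx chunk iterator in a file-like object and hands it to boto3's `upload_fileobj`, which performs a managed (multipart) upload and reads the stream sequentially, so nothing is written to the local disk. The URL, bucket, and key names are placeholders.

```python
import io


class IterStream(io.RawIOBase):
    """Adapt an iterator of byte chunks into a read-only binary stream."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, buf):
        # Refill from the iterator until we have data or hit EOF.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # EOF
        n = min(len(buf), len(self._leftover))
        buf[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def stream_to_s3(url, bucket, key):
    # boto3 and httpx are imported lazily so the adapter above can be
    # used without AWS credentials or network access.
    import boto3
    import httpx

    s3 = boto3.client("s3")
    with httpx.stream("GET", url) as response:
        response.raise_for_status()
        body = io.BufferedReader(
            IterStream(response.iter_bytes()), buffer_size=1024 * 1024
        )
        # upload_fileobj runs a managed transfer: for large bodies it
        # splits the stream into parts and uses the multipart API.
        s3.upload_fileobj(body, bucket, key)


# stream_to_s3("https://speed.hetzner.de/100MB.bin", "my-bucket", "100MB.bin")
```

The tqdm progress bar from the question can be kept by calling `progress.update(len(chunk))` inside a generator that yields each chunk before it is passed to `IterStream`.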

0 Answers