I am attempting to pull a file from AWS S3, using Boto3, directly into a BytesIO object. This will eventually be used to manipulate the downloaded data, but for now I'm just trying to serve that file directly to a user via Flask. As I understand it, the code below should work, but it does not. The browser displays nothing and reports that only a few bytes of data were downloaded.

(In this example, my sample file is a png)

from flask import Flask, send_from_directory, abort, Response, send_file, make_response
import boto3, botocore
import os
import io

AWS_ACCESS_KEY = os.environ['AWS_ACCESS_KEY'].rstrip()
AWS_SECRET_KEY = os.environ['AWS_SECRET_KEY'].rstrip()
S3_BUCKET = "static1"
app = Flask(__name__, static_url_path='/tmp')

@app.route('/', defaults={'path': ''})
@app.route('/<path:path>')
def catch_all(path):
    s3 = boto3.client('s3', aws_access_key_id=AWS_ACCESS_KEY, aws_secret_access_key=AWS_SECRET_KEY,)
    file = io.BytesIO()
    metadata = s3.head_object(Bucket=S3_BUCKET, Key=path)
    conf = boto3.s3.transfer.TransferConfig(use_threads=False)
    s3.download_fileobj(S3_BUCKET, path, file)
    return send_file(file, mimetype=metadata['ContentType'])

if __name__ == '__main__':
    app.run(debug=True, port=3000, host='0.0.0.0')

If I modify that core routine to write the BytesIO object to disk, then read it back into a new BytesIO object - it works fine. As below:

def catch_all(path):
    s3 = boto3.client('s3', aws_access_key_id=AWS_ACCESS_KEY, aws_secret_access_key=AWS_SECRET_KEY,)
    file = io.BytesIO()
    metadata = s3.head_object(Bucket=S3_BUCKET, Key=path)
    conf = boto3.s3.transfer.TransferConfig(use_threads=False)
    s3.download_fileobj(S3_BUCKET, path, file)
    print(file.getvalue())
    fh = open("/tmp/test1.png","wb")
    fh.write(file.getvalue())
    fh.close()
    fh = open("/tmp/test1.png","rb")
    f2 = io.BytesIO(fh.read())
    fh.close()
    print(f2.getvalue())
    return send_file(f2, mimetype=metadata['ContentType'])

I've been going around in circles with this for a few days. It's clear that I'm missing something, but I'm not sure what. The script is being run inside a Python 3.8 Docker container with the latest versions of boto3, Flask, etc.

Jon

1 Answer

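Rewinding your BytesIO object should do the trick: call file.seek(0) just before send_file(...). After download_fileobj writes into the buffer, the stream position is at the end of the data, so send_file has nothing left to read from there.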
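To see why the rewind matters, here is a minimal sketch using only the standard library. The write() call merely stands in for what download_fileobj does to the buffer, and the payload bytes are made up for illustration.

```python
import io

buf = io.BytesIO()
# Stand-in for s3.download_fileobj(..., buf): data is written into the buffer.
buf.write(b"\x89PNG fake payload")

# After writing, the stream position sits at the end of the data.
position_after_write = buf.tell()

# Reading from the current position therefore yields nothing.
unrewound = buf.read()

# Rewinding to the start makes the full payload readable again.
buf.seek(0)
rewound = buf.read()
```

Anything that consumes the buffer by calling read() on it sees only what lies after the current position, which would explain the near-empty response.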

For the record, I'm not sure your boto3/botocore calls follow "best practices". To try out your use case, I ended up with:

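import io

from boto3.session import Session
from flask import send_file

# KEY_ID, ACCESS_KEY, REGION_NAME, BUCKET, PATH and the base_bp blueprint
# are placeholders defined elsewhere in the application.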

session = Session(
    aws_access_key_id=KEY_ID, aws_secret_access_key=ACCESS_KEY, region_name=REGION_NAME
)
s3 = session.resource("s3")


@base_bp.route("/test-stuff")
def test_stuff():
    a_file = io.BytesIO()
    s3_object = s3.Object(BUCKET, PATH)
    s3_object.download_fileobj(a_file)
    a_file.seek(0)
    return send_file(a_file, mimetype=s3_object.content_type)

It works when reading the file back from disk because you instantiate the new BytesIO with the full content of the file, so it is fully populated and still at "position 0".
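The difference between the two ways of filling a buffer can be checked with a short standard-library sketch. The sample bytes and variable names are illustrative only.

```python
import io

data = b"example bytes"

# Case 1: start empty and write into the buffer (as a download would).
written = io.BytesIO()
written.write(data)
written_pos = written.tell()      # position is at the end of the data

# Case 2: construct the buffer from existing bytes (as in the disk round trip).
preloaded = io.BytesIO(data)
preloaded_pos = preloaded.tell()  # position is at the start

# getvalue() ignores the position, so both buffers report the same content.
same_content = written.getvalue() == preloaded.getvalue()
```

This is also why printing file.getvalue() showed the full data in both versions of the question's code: getvalue() returns the whole buffer regardless of where the stream position is.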

b4stien