I have a setup in AWS where a Python Lambda proxies an S3 bucket containing .tar.gz files. I need to return the .tar.gz file from the Lambda back through the API to the user.
I do not want to untar the file; I want to return the tarfile as is, and it seems the tarfile module does not support reading it in as bytes. I have tried Python's open method (which raises a UTF-8 codec error), and then codecs.open with errors set to both ignore and replace, which leads to the resulting file not being recognized as .tar.gz.
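To illustrate what seems to be going wrong, here is a minimal sketch that does not touch S3 at all. It assumes the problem is the text decoding step: codecs.open with an encoding decodes what it reads, and errors='ignore' silently drops any byte that is not valid UTF-8. The payload below is a made-up stand-in for the real archive.

```python
import gzip

# Stand-in for the real .tar.gz contents (hypothetical payload).
original = gzip.compress(b"example payload" * 10)

# Decoding as UTF-8 with errors='ignore' drops invalid bytes,
# including the 0x8b in the gzip magic number 0x1f 0x8b.
decoded = original.decode("utf-8", errors="ignore")
round_tripped = decoded.encode("utf-8")

print(original[:2])               # the gzip magic bytes
print(round_tripped[:2])          # magic is no longer intact
print(round_tripped == original)  # data no longer matches
```

If this is what happens in the Lambda, the downloaded file would no longer start with the gzip magic bytes, which would match the "not in gzip format" error from tar.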
Implementation (tar binary unpackaging)
try:
    data = client.get_object(Bucket=bucket, Key=key)
    headers['Content-Type'] = data['ContentType']
    if key.endswith('.tar.gz'):
        with open('/tmp/tmpfile', 'wb') as wbf:
            bucketobj.download_fileobj(key, wbf)
        with codecs.open('/tmp/tmpfile', "rb", encoding='utf-8', errors='ignore') as fdata:
            body = fdata.read()
        headers['Content-Disposition'] = 'attachment; filename="{}"'.format(key.split('/')[-1])
Usage (package/aws information redacted for security)
$ wget -v https://<apigfqdn>/release/simple/<package>/<package>-1.0.4.tar.gz
$ tar -xzf <package>-1.0.4.tar.gz
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
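
One direction that might avoid the corruption is to never decode the archive as text, and instead base64-encode the raw bytes in the response. The sketch below assumes the API uses Lambda proxy integration and that API Gateway has a matching binary media type configured (for example application/gzip); neither is confirmed by the setup above. The helper name build_binary_response and the default content type are hypothetical.

```python
import base64


def build_binary_response(raw_bytes, filename, content_type="application/gzip"):
    """Hypothetical helper: wrap raw bytes in a Lambda proxy response dict."""
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": content_type,
            "Content-Disposition": 'attachment; filename="{}"'.format(filename),
        },
        # Base64 keeps every byte intact; no UTF-8 decoding is involved.
        "body": base64.b64encode(raw_bytes).decode("ascii"),
        "isBase64Encoded": True,
    }


# In the handler, the bytes could be read from the object already fetched:
#   raw = data['Body'].read()
#   return build_binary_response(raw, key.split('/')[-1], data['ContentType'])
```

With this shape there is no need for the temporary file or codecs.open, since the body returned by get_object can be read as bytes directly. Responses would still be subject to the Lambda and API Gateway payload size limits.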