
I have a problem uploading big files and finding a usable ContentMD5 method to verify the transfer.

I started with client.upload_file. This method has no ContentMD5 parameter, so I tried using a function to generate a local ETag for the file and compare it with the ETag of the transferred file.

I found that if you use KMS encryption in your S3 bucket, the ETag somehow depends on KMS, and a locally generated ETag is not equal to the one in S3.

My second try was Object.put. Here you can use ContentMD5, and KMS also works, but the function uploads in a single stream and not as multipart. A single PUT cannot upload big files (S3 limits it to 5 GB).

So now I am kind of stuck. There are create_multipart_upload and upload_part functions, but I cannot find any complete example that uses them with ContentMD5.

This was the Object.put attempt:

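import base64
import hashlib

from botocore.exceptions import ClientError

# Content-MD5 is the base64-encoded binary MD5 digest, passed as a str
with open(file_name, 'rb') as f:
    binary_hash = hashlib.md5(f.read()).digest()
file_md5 = base64.b64encode(binary_hash).decode('ascii')

metadata = {
    "md5sum": file_md5
}

try:
    obj = s3_resource.Object(bucket, fileobj)
    with open(file_name, 'rb') as body:
        obj.put(
            Body=body,
            ContentMD5=file_md5,
            Metadata=metadata,
            ServerSideEncryption='aws:kms',
            SSEKMSKeyId=s3kmskey)
except ClientError as err:
    # S3 rejects the request if the body does not match ContentMD5
    print(f"Upload failed: {err}")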
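
The snippet above reads the whole file into memory to hash it, which gets expensive for big files. A minimal sketch of hashing in fixed-size blocks instead; the helper name `content_md5` and the 1 MiB block size are assumptions for illustration, not part of boto3:

```python
import base64
import hashlib


def content_md5(path, chunk_size=1024 * 1024):
    """Return the base64-encoded MD5 digest of a file, read in blocks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops at the first empty read (end of file),
        # so only one block is held in memory at a time.
        for block in iter(lambda: f.read(chunk_size), b""):
            md5.update(block)
    # ContentMD5 expects the base64 of the raw digest, as a str.
    return base64.b64encode(md5.digest()).decode("ascii")
```

The return value can be passed as `ContentMD5=content_md5(file_name)` in the `obj.put` call.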

1 Answer


A multipart upload splits the file into chunks. So you need to calculate the MD5 checksum of each chunk, concatenate those binary digests, and then take the MD5 checksum of the concatenation. The ETag is the hex digest of that final checksum followed by -n, where n is the number of parts. This is not described in the official documentation.

This Python script can do the work for you.

Note: As the documentation suggests, this does not work for objects encrypted with KMS.
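
The ETag rule described above can be sketched as follows. This is an illustration, not the script linked in the answer; the function name `multipart_etag` is hypothetical, and the 8 MB default assumes the upload used boto3's default multipart chunk size. The result only matches if the part size is the same one used for the actual upload, and, as noted, not for KMS-encrypted objects.

```python
import hashlib


def multipart_etag(path, part_size=8 * 1024 * 1024):
    """Compute the ETag expected for a multipart upload of `path`."""
    part_digests = []
    with open(path, "rb") as f:
        # Hash each part separately, keeping the raw (binary) digests.
        for part in iter(lambda: f.read(part_size), b""):
            part_digests.append(hashlib.md5(part).digest())
    # MD5 over the concatenated part digests, then "-<number of parts>".
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

S3 returns the ETag wrapped in double quotes, so strip them before comparing, for example `response["ETag"].strip('"') == multipart_etag(file_name)`.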

Alternatively, you can calculate the MD5 hash yourself and send it with the request in the Content-MD5 header.

Munavir Chavody
    As you already pointed out, this does not work with KMS encryption. A few hours ago I got a response from AWS support: the only solution I can use from boto3 for multipart upload + ContentMD5 in a KMS-encrypted S3 bucket is [create_multipart_upload](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.create_multipart_upload). The only code example of how to implement this, at least that I could find, is https://gist.github.com/teasherm/bb73f21ed2f3b46bc1c2ca48ec2c1cf5 – noideawhatiamdoing Sep 19 '19 at 13:50
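
A rough sketch of what the create_multipart_upload approach from this comment could look like. It is not the code from the linked gist. The function name `multipart_upload_with_md5` and its parameters are hypothetical, and `client` is assumed to be a boto3 S3 client such as `boto3.client("s3")`. Each part carries its own Content-MD5, so S3 verifies every part on arrival; there is no single checksum for the whole file. S3 requires every part except the last to be at least 5 MB.

```python
import base64
import hashlib


def multipart_upload_with_md5(client, path, bucket, key, kms_key_id,
                              part_size=8 * 1024 * 1024):
    """Upload `path` in parts, sending a Content-MD5 with every part."""
    upload = client.create_multipart_upload(
        Bucket=bucket,
        Key=key,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id,
    )
    upload_id = upload["UploadId"]
    parts = []
    try:
        with open(path, "rb") as f:
            chunks = iter(lambda: f.read(part_size), b"")
            for part_number, chunk in enumerate(chunks, start=1):
                # Base64 of the raw MD5 digest of this part only.
                digest = base64.b64encode(hashlib.md5(chunk).digest()).decode("ascii")
                response = client.upload_part(
                    Bucket=bucket,
                    Key=key,
                    UploadId=upload_id,
                    PartNumber=part_number,
                    Body=chunk,
                    ContentMD5=digest,
                )
                # complete_multipart_upload needs each part's ETag and number.
                parts.append({"ETag": response["ETag"], "PartNumber": part_number})
        return client.complete_multipart_upload(
            Bucket=bucket,
            Key=key,
            UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so already-uploaded parts are not left behind in the bucket.
        client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

A call might look like `multipart_upload_with_md5(boto3.client("s3"), file_name, bucket, fileobj, s3kmskey)`, reusing the variable names from the question.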