
If a client is using AWS request signing (Signature Version 4), is there ever a reason to do separate integrity checking for AWS S3 uploads, or is the integrity checking inherent in the protocol adequate?

I'm referring particularly to multipart uploads, which are described at https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html and https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadComplete.html, but also to single-part uploads.

To briefly summarize:

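  • Each request to upload a part of a file is signed with an HMAC-SHA256 signature computed over a canonical form of the request, which covers the signed headers and a SHA-256 hash of the data (sent as `x-amz-content-sha256`).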

  • In response to each part, AWS returns an ETag, which is a proprietary hash of the data in that part of the file. Usually this is an MD5 of the data for that part, but in the case of AWS-KMS encryption, it's an undocumented algorithm.

  • After all parts are uploaded, the client sends a request that specifies that the individual parts be stitched together into a file/key. The request contains the part numbers, and the AWS-generated ETag of each part.
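To make the second step concrete, here is a minimal sketch of comparing the ETag returned for one part against a locally computed MD5. The function name `part_etag_matches` is hypothetical, and the check only applies in the case described above where the ETag is a plain MD5 of the part; it is not expected to work when AWS-KMS encryption is in effect.

```python
import hashlib


def part_etag_matches(part: bytes, etag_header: str) -> bool:
    """Return True if the ETag returned for a part equals the part's MD5.

    Assumption: the ETag is the hex MD5 of the part data, which is the
    usual case but not the case with AWS-KMS encryption.
    """
    # The ETag header value arrives wrapped in double quotes.
    returned = etag_header.strip().strip('"').lower()
    return returned == hashlib.md5(part).hexdigest()
```

A client could call this on each UploadPart response and retry the part on a mismatch before sending the completion request.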

Some clients do extra checking based on the key's final AWS-generated ETag vs a locally-calculated version of the ETag (which has been discussed at "What is the algorithm to compute the Amazon-S3 Etag for a file larger than 5GB?", for instance), but is there any point to this?
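For reference, the locally-calculated ETag those clients use is usually computed as sketched below. This follows the community-observed convention (MD5 of the concatenated binary MD5 digests of the parts, followed by a dash and the part count), which as far as I know is not an officially documented algorithm and does not apply with AWS-KMS encryption. The function name `multipart_etag` is hypothetical.

```python
import hashlib
from typing import Iterable


def multipart_etag(parts: Iterable[bytes]) -> str:
    """Compute the expected ETag of a multipart object from its parts.

    Assumption: the observed (undocumented) convention of
    md5(md5(part1) + md5(part2) + ...) in hex, then "-<part count>".
    The parts must be split exactly as they were uploaded.
    """
    # Binary (not hex) MD5 digest of each part, in upload order.
    digests = [hashlib.md5(part).digest() for part in parts]
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"
```

The result depends on the part boundaries, so the same file uploaded with a different part size yields a different ETag.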

One of the reasons I ask is that apparently no one has yet reverse-engineered the ETag algorithm used when server-side AWS-KMS encryption is in effect. However, it appears to me that integrity checking is sufficiently inherent in the protocol that additional checking is unnecessary.

Thanks.

Mark R
  • S3 also supports a `Content-MD5` header that must match the payload, or it will be rejected. I use this in addition to `x-amz-content-sha256` because I am paranoid, and no excuse is a good excuse to bypass any validation mechanism available to me. I also pre-calculate the multipart etag and send it as `x-amz-meta-expect-etag` for later validation, and refuse to complete the upload unless the service calculates a matching value. Whether this is all "necessary" seems like a matter of opinion. – Michael - sqlbot May 24 '18 at 22:22
  • @Michael-sqlbot I see the Content-MD5 header documented for single-part uploads: https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT.html but I don't see it documented for multi-part uploads: https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadComplete.html Do you think it does work for multi-part uploads? Thanks. – Mark R May 24 '18 at 22:42
  • Yes, it works for multipart, but you supply it with each individual part, with the hash of the bytes in that part: https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html – Michael - sqlbot May 25 '18 at 00:48
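As a rough illustration of the per-part headers discussed in these comments, the sketch below computes both values for a single part. `Content-MD5` is the base64 encoding of the binary MD5 digest, and `x-amz-content-sha256` is the hex SHA-256 of the payload. The helper name `part_integrity_headers` is hypothetical, and the sketch leaves out the request signing itself.

```python
import base64
import hashlib
from typing import Dict


def part_integrity_headers(part: bytes) -> Dict[str, str]:
    """Build the integrity headers for one part's bytes."""
    # Content-MD5 is base64 of the raw 16-byte digest, not of the hex string.
    md5_b64 = base64.b64encode(hashlib.md5(part).digest()).decode("ascii")
    # x-amz-content-sha256 is the lowercase hex SHA-256 of the payload.
    sha256_hex = hashlib.sha256(part).hexdigest()
    return {
        "Content-MD5": md5_b64,
        "x-amz-content-sha256": sha256_hex,
    }
```

Each part gets its own pair of values, computed over only the bytes in that part.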
