7

I want to use SHA-256 for the checksum of my objects, but it looks like Amazon S3 uses MD5 for the ETag.

Is there any workaround?

MarcJohnson
  • 2
    Note that the ETag isn't always an MD5 digest; for multipart uploads it is an AWS-specific hash built from chunks of the original raw data. You can generate this locally, but the format is not really documented or stable for client use, so it is better to store your own hash as metadata. Assume the API uploads it correctly, and use your SHA-256 to verify the downloads. http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html See also https://stackoverflow.com/questions/12186993/what-is-the-algorithm-to-compute-the-amazon-s3-etag-for-a-file-larger-than-5gb/19896823#19896823 – michael Nov 20 '17 at 09:01

3 Answers

12

This is possible as of 2022-02-25. S3 now features a Checksum Retrieval function GetObjectAttributes:

New – Additional Checksum Algorithms for Amazon S3 | AWS News Blog

Checksum Retrieval – The new GetObjectAttributes function returns the checksum for the object and (if applicable) for each part.

This function supports SHA-1, SHA-256, CRC-32, and CRC-32C for checking the integrity of the transmission.
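To compare a local file against a checksum reported by S3, you would need the same digest in the same encoding. The following is only a minimal sketch, assuming that the SHA-256 value S3 reports for a single-part object is the Base64 encoding of the raw digest of the object's bytes. It makes no AWS calls; the class and method names (`LocalChecksum`, `sha256Base64`, `matches`) are hypothetical and only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of any AWS SDK.
class LocalChecksum {

    // Returns the Base64 encoding of the raw SHA-256 digest of the given bytes.
    static String sha256Base64(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return Base64.getEncoder().encodeToString(digest.digest(data));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Compares the locally computed checksum with a value reported by the service
    // (assumption: the reported value is Base64 text).
    static boolean matches(byte[] data, String reportedChecksum) {
        byte[] local = sha256Base64(data).getBytes(StandardCharsets.US_ASCII);
        byte[] remote = reportedChecksum.getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(local, remote);
    }

    public static void main(String[] args) {
        byte[] body = "hello world".getBytes(StandardCharsets.UTF_8);
        String local = sha256Base64(body);
        System.out.println("local SHA-256 (Base64): " + local);
        System.out.println("matches itself: " + matches(body, local));
    }
}
```

For multipart uploads the reported value may be computed per part rather than over the whole object, so this whole-object comparison should not be assumed to apply there.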

I'm so glad that they now have alternatives to the sad choice of MD5, which is not optimal for anything in particular and was broken for other purposes long ago. See also related discussion of quirks with their MD5 approach at How to get the md5sum of a file on Amazon's S3.

[And while I'm discussing hashes for various purposes, note that a good one for hash-table lookups and other situations that need some basic randomness and security properties is HighwayHash: Fast strong hash functions: SipHash/HighwayHash]

nealmcb
6

Unfortunately, there's no direct way to make S3 use SHA256 for ETag. You could use S3 metadata as a workaround. For this, you can calculate the SHA256 checksum yourself and use user defined S3 object metadata to set it for each upload. User defined metadata is just a set of key-value pairs you can assign to your object. You'll have to set the checksum when you PUT your object and compare it on GET/HEAD object.

More information is available in the S3 documentation:

AWS - Object Key and Metadata
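The metadata workaround described above can be sketched roughly as follows. This is an illustration under assumptions, not an SDK recipe: the metadata key `sha256` and the names `Sha256Metadata`, `metadataFor`, and `verify` are hypothetical, and the actual PUT/GET calls are left out. The map built here is what you would hand to your S3 client as user-defined metadata (sent as `x-amz-meta-*` headers) on upload and read back on GET/HEAD.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of any AWS SDK.
class Sha256Metadata {

    // Assumed metadata key; any name you choose consistently would work.
    static final String KEY = "sha256";

    // Returns the lowercase hex SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Builds the user-defined metadata to attach when you PUT the object.
    static Map<String, String> metadataFor(byte[] data) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(KEY, sha256Hex(data));
        return metadata;
    }

    // After a GET: recomputes the hash of the downloaded bytes and compares it
    // with the value stored in the object's metadata.
    static boolean verify(byte[] downloaded, Map<String, String> metadata) {
        String expected = metadata.get(KEY);
        return expected != null && expected.equalsIgnoreCase(sha256Hex(downloaded));
    }

    public static void main(String[] args) {
        byte[] body = "example object body".getBytes(StandardCharsets.UTF_8);
        Map<String, String> metadata = metadataFor(body);
        System.out.println("metadata to upload: " + metadata);
        System.out.println("download verified: " + verify(body, metadata));
    }
}
```

Note that the hash is computed on the client, so this detects corruption on download but does not prove that S3 itself hashed the stored bytes.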

Rohan Ashik
user818510
  • Thanks for your reply. I was hoping there would be another solution. With this one, the SHA-256 hash is not calculated by S3 itself, so verifying the file after upload is not really trustworthy ;) – MarcJohnson May 23 '17 at 12:03
  • 1
    Maybe I found another solution. Do you think it is possible to calculate the hash in an AWS Lambda function and save it as user-defined S3 object metadata? This solution would be trustworthy since the hash is calculated on the server side. – MarcJohnson Jun 03 '17 at 12:32
  • @MarcJohnson did you manage to get your solution above working, i.e. user-defined SHA-256 metadata via AWS Lambda? If so, I would love to see it; please share. – BenKoshy Dec 25 '19 at 00:38
-2

Please refer to: How to calculate SHA-256 checksum of S3 file content

It can be achieved with the following steps in Java:

  1. Get InputStream of the S3 Object

    InputStream inputStream = amazonS3.getObject(bucket, file).getObjectContent();

  2. Use MessageDigest and DigestInputStream classes for the SHA-256 hash

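    // Imports needed by the enclosing class
    import java.io.IOException;
    import java.io.InputStream;
    import java.security.DigestInputStream;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    // Streams the input through a MessageDigest and returns the lowercase hex hash.
    // Pass "SHA-256" as the algorithm to get a SHA-256 checksum.
    public static String getHash(InputStream inputStream, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest messageDigest = MessageDigest.getInstance(algorithm);
        // try-with-resources closes the stream even if reading fails
        try (DigestInputStream digestInputStream = new DigestInputStream(inputStream, messageDigest)) {
            byte[] buffer = new byte[4096];
            // Reading through the DigestInputStream updates the digest as a side effect
            while (digestInputStream.read(buffer) != -1) {
                // nothing else to do with the bytes
            }
        }
        byte[] hash = messageDigest.digest();
        StringBuilder sb = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            sb.append(String.format("%02x", b));
        }
        // Errors propagate to the caller instead of being swallowed into a null result
        return sb.toString();
    }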
    
meeza