I'll answer my own question, as this has been bugging me for a while and I just found a way to resolve it. For copying the contents of buckets, most if not all solutions I have seen resort to guessing the part size and simply abandon the idea of having matching ETags on the source and target buckets. Funnily enough, AWS themselves have published the Campanile framework, which resorts to guessing the part number and simply assumes the object was copied with the AWS CLI tools.
It turns out there is a documented way of doing this: the AWS CLI has an option for the get-object and head-object APIs that lets you specify which part number you want, like this:
aws s3api head-object --bucket YOURBUCKET --key YOURKEY --part-number 1
This returns a header that looks like this:
{
"AcceptRanges": "bytes",
"ContentType": "application/octet-stream",
"LastModified": "Mon, 31 Jul 2017 08:23:11 GMT",
"ContentLength": 8388608,
"ETag": "\"XXXX-6\"",
"ServerSideEncryption": "AES256",
"PartsCount": 6,
"Metadata": {}
}
In this case, as you can see, the ContentLength header of part number 1 tells us the part size used for this upload: 8388608 bytes, that is 8 MiB, the same size that was used when uploading this object.
If you run the command with the --debug flag, you can see how the part number is passed in the REST world: the CLI simply adds a URL parameter, partNumber=1.
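Once the part size is known, the multipart ETag can be recomputed locally to check that a copy matches. The sketch below assumes the commonly observed scheme (MD5 of the concatenated binary MD5 digests of each part, followed by a dash and the part count). As far as I know that scheme is not a formal guarantee, and it does not hold for objects encrypted with SSE-KMS or SSE-C. The function name `multipart_etag` is my own.

```python
import hashlib


def multipart_etag(stream, part_size):
    """Recompute the ETag commonly seen on S3 multipart uploads.

    stream    -- a binary file-like object (anything with .read(n))
    part_size -- part size in bytes, e.g. the ContentLength of part 1

    Assumption: ETag = md5(md5(part1) + md5(part2) + ...) + "-" + part count.
    """
    if part_size <= 0:
        raise ValueError("part_size must be a positive number of bytes")

    part_digests = []
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            break
        # Keep the raw 16-byte digest of every part, not the hex string.
        part_digests.append(hashlib.md5(chunk).digest())

    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return "{}-{}".format(combined, len(part_digests))


# Usage (hypothetical local file name):
#   with open("local-copy.bin", "rb") as f:
#       print(multipart_etag(f, 8388608))
```

The result is the unquoted value, so strip the surrounding double quotes from the ETag that S3 returns before comparing.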
aws --debug s3api head-object --bucket YOURBUCKET --key YOURKEY --part-number 1
....
2017-07-31 16:21:46,968 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=HeadObject) (verify_ssl=True) with params:
{'body': '', 'url': u'https://s3.amazonaws.com/YOURKEY/?partNumber=1',
'headers': {'User-Agent': 'aws-cli/1.11.127 Python/2.7.12 Linux/4.4.35-33.55.amzn1.x86_64 botocore/1.5.90'},
'context': {'auth_type': None, 'client_region': 'us-east-1', 'signing': {'bucket': u'YOURBUCKET'}, 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x7f20a8e1ff50>},
-----> 'query_string': {u'partNumber': 1}, <-----
'url_path': u'/YOURBUCKET/YOURKEY', 'method': u'HEAD'}
....
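The same request can be made from Python. This is a minimal sketch that assumes boto3, whose `head_object` call accepts a `PartNumber` argument and returns `ContentLength` and `PartsCount` for multipart objects. The helper name `part_layout` is made up for illustration.

```python
def part_layout(s3_client, bucket, key):
    """Return (part_size_in_bytes, parts_count) for an S3 object.

    s3_client is expected to be a boto3 S3 client, for example
    boto3.client("s3"); it is passed in so the helper stays easy to test.
    """
    # Asking for part 1 makes S3 report the size of that part
    # instead of the size of the whole object.
    head = s3_client.head_object(Bucket=bucket, Key=key, PartNumber=1)

    # PartsCount is only returned for multipart uploads, so fall back to 1.
    return head["ContentLength"], head.get("PartsCount", 1)


# Usage (requires boto3 and valid credentials):
#   import boto3
#   size, count = part_layout(boto3.client("s3"), "YOURBUCKET", "YOURKEY")
#   print(size, count)
```

Note that the last part is usually smaller than the others, which is why part 1 is the one to query.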
The next bit is figuring out how to sign such URLs. The AWS CLI command "aws s3 presign" is unable to do that.
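One possible route, which I have not verified across botocore versions, is boto3's `generate_presigned_url`. The sketch below assumes that passing `PartNumber` in `Params` makes botocore put `partNumber` into the signed query string, the same way it does for a normal `head_object` call. The helper name `presign_part_head` is hypothetical.

```python
def presign_part_head(s3_client, bucket, key, part_number=1, expires_in=3600):
    """Build a presigned HEAD URL for one part of an S3 object.

    s3_client is expected to be a boto3 S3 client. Assumption: botocore
    serializes PartNumber into the partNumber query parameter and
    includes it in the signature.
    """
    return s3_client.generate_presigned_url(
        ClientMethod="head_object",
        Params={"Bucket": bucket, "Key": key, "PartNumber": part_number},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
        HttpMethod="HEAD",
    )


# Usage (requires boto3 and valid credentials):
#   import boto3
#   url = presign_part_head(boto3.client("s3"), "YOURBUCKET", "YOURKEY")
#   # then issue: curl -I "<url>"
```

The URL has to be requested with the HEAD method, since the HTTP method is part of what gets signed.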