Executive summary of the problem: I have a bucket, let's call it bucket A, that is set up with a default customer KMS key (I will call its ID 1111111) in one account, which we will call 123. In that bucket there are two objects, both under the same path. They have the same KMS key ID and the same owner. When I attempt to sync these to a new bucket B in a different account, let's call it account 456, one is synced successfully but the other is not, and instead I get:

An error occurred (AccessDenied) when calling the CopyObject operation: Access Denied

Has anyone seen inconsistent behavior like this before? I say inconsistent because there is absolutely no difference in the access rights between these objects, yet one succeeds and the other doesn't. Note: my summary mentions two objects for simplicity, but in one of my real cases there are 30 objects, of which 2 copy over and the rest fail, and other paths show different mixed results.

The following describes the setup (some data is obfuscated for security, but in a consistent manner):

Bucket A (com.mycompany.datalake.us-east-1) Bucket Policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::123:root",
                    "arn:aws:iam::456:root"
                ]
            },
            "Action": [
                "s3:PutObjectTagging",
                "s3:PutObjectAcl",
                "s3:PutObject",
                "s3:ListBucket",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*",
                "arn:aws:s3:::com.mycompany.datalake.us-east-1"
            ]
        },
        {
            "Sid": "DenyIfNotGrantingFullAccess",
            "Effect": "Deny",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::123:root",
                    "arn:aws:iam::456:root"
                ]
            },
            "Action": "s3:PutObject",
            "Resource": [
                "arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*",
                "arn:aws:s3:::com.mycompany.datalake.us-east-1"
            ],
            "Condition": {
                "StringNotLike": {
                    "s3:x-amz-acl": "bucket-owner-full-control"
                }
            }
        },
        {
            "Sid": "DenyIfNotUsingExpectedKmsKey",
            "Effect": "Deny",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::123:root",
                    "arn:aws:iam::456:root"
                ]
            },
            "Action": "s3:PutObject",
            "Resource": [
                "arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*",
                "arn:aws:s3:::com.mycompany.datalake.us-east-1"
            ],
            "Condition": {
                "StringNotLike": {
                    "s3:x-amz-server-side-encryption-aws-kms-key-id": "arn:aws:kms:us-east-1:123:key/1111111"
                }
            }
        }
    ]
}

Also in the source account, I have created a role to be assumed, which I call datalake_full_access_role, with this policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*",
                "arn:aws:s3:::com.mycompany.datalake.us-east-1"
            ]
        }
    ]
}

This role has a trust relationship with account 456. Also worth mentioning is that the policy for the KMS key 1111111 is currently wide open:

{
    "Version": "2012-10-17",
    "Id": "key-default-1",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "kms:*",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": [
                "kms:Encrypt*",
                "kms:Decrypt*",
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*",
                "kms:Describe*"
            ],
            "Resource": "*"
        }
    ]
}

Now for the target bucket B (mycompany-us-west-2-datalake) in account 456, the Bucket Policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AccountBasedAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::456:root"
            },
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::mycompany-us-west-2-datalake",
                "arn:aws:s3:::mycompany-us-west-2-datalake/*"
            ]
        }
    ]
}

To do the migration (the sync) I provision an EC2 instance within the 456 account and attach to it an instance profile that has the following policies attached to it:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::123:role/datalake_full_access_role"
        }
    ]
}
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kms:DescribeKey",
                "kms:ReEncrypt*",
                "kms:CreateGrant",
                "kms:Decrypt"
            ],
            "Resource": [
                "arn:aws:kms:us-east-1:123:key/1111111"
            ]
        }
    ]
}
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::com.mycompany.datalake.us-east-1",
                "arn:aws:s3:::com.mycompany.datalake.us-east-1/security=0/*"
            ]
        }
    ]
}
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::mycompany-us-west-2-datalake",
                "arn:aws:s3:::mycompany-us-west-2-datalake/*"
            ]
        }
    ]
}

Now on the EC2 instance I install the latest AWS CLI version:

$ aws --version
aws-cli/1.16.297 Python/3.5.2 Linux/4.4.0-1098-aws botocore/1.13.33

and then run my sync command:

aws s3 sync s3://com.mycompany.datalake.us-east-1 s3://mycompany-us-west-2-datalake --source-region us-east-1 --region us-west-2 --acl bucket-owner-full-control --exclude '*' --include '*/zone=raw/Event/*' --no-progress
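For readers unfamiliar with the filter flags: the AWS CLI documentation states that filters appearing later in the command take precedence over earlier ones, so `--exclude '*'` followed by `--include '*/zone=raw/Event/*'` keeps only the Event keys. A minimal sketch of that rule follows. It approximates the behavior with Python's `fnmatch` and is not the CLI's actual implementation; `is_included` and the second key are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase

def is_included(key, filters):
    """Apply (kind, pattern) filters in order; the last matching filter wins."""
    included = True  # with no filters, every key is included
    for kind, pattern in filters:
        if fnmatchcase(key, pattern):
            included = (kind == "include")
    return included

# Same order as on the command line: exclude everything, then re-include Event keys.
filters = [("exclude", "*"), ("include", "*/zone=raw/Event/*")]

print(is_included("security=0/zone=raw/Event/11_96152d009794494efeeae49ed10da653.avro", filters))
print(is_included("security=0/zone=other/example.avro", filters))  # hypothetical key
```

Both the succeeding and the failing object match the include pattern, so the filters themselves do not explain why only some objects copy.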

I believe I've done my homework and this should all work, and for several objects it does, but not for all, and I have nothing else up my sleeve to try at this point. Note that I have been 100% successful syncing to a local directory on the EC2 instance and then from that local directory to the new bucket, using the following two calls:

aws s3 sync s3://com.mycompany.datalake.us-east-1 datalake --source-region us-east-1 --exclude '*' --include '*/zone=raw/Event/*' --no-progress
aws s3 sync datalake s3://mycompany-us-west-2-datalake --region us-west-2 --acl bucket-owner-full-control --exclude '*' --include '*/zone=raw/Event/*' --no-progress

This absolutely makes no sense as from an access POV there is no difference. The following is a look into the attributes of two objects in the source bucket, one that succeeds and one that fails:

Successful object:

Owner
Dev.Awsmaster

Last modified
Jan 12, 2019 10:11:48 AM GMT-0800

Etag
12ab34

Storage class
Standard

Server-side encryption
AWS-KMS

KMS key ID
arn:aws:kms:us-east-1:123:key/1111111

Size
9.2 MB

Key
security=0/zone=raw/Event/11_96152d009794494efeeae49ed10da653.avro

Failed object:

Owner
Dev.Awsmaster

Last modified
Jan 12, 2019 10:05:26 AM GMT-0800

Etag
45cd67

Storage class
Standard

Server-side encryption
AWS-KMS

KMS key ID
arn:aws:kms:us-east-1:123:key/1111111

Size
3.2 KB

Key
security=0/zone=raw/Event/05_6913583e47f457e9e25e9ea05cc9c7bb.avro

ADDENDUM: After looking through several cases I am starting to see a pattern. I think there may be an issue when the object is too small. In 10 out of 10 directories analyzed where some but not all objects synced successfully, all that were successful had a size of 8MB or more and all that failed were under 8MB. Could this be a bug with aws s3 sync when KMS is in the mix? I am wondering if I can tweak the ~/.aws/config such that it may address this?
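The size pattern described in the addendum can be checked mechanically. The sketch below is only an illustration of that check: it uses the two objects listed above with their sizes converted approximately from the console values (9.2 MB and 3.2 KB), and `pattern_holds` is a hypothetical helper name.

```python
MB = 1024 * 1024
THRESHOLD = 8 * MB  # size boundary observed in the addendum

# (key, approximate size in bytes, synced successfully?)
observations = [
    ("security=0/zone=raw/Event/11_96152d009794494efeeae49ed10da653.avro", int(9.2 * MB), True),
    ("security=0/zone=raw/Event/05_6913583e47f457e9e25e9ea05cc9c7bb.avro", int(3.2 * 1024), False),
]

def pattern_holds(rows, threshold=THRESHOLD):
    """True if every success is at or above the threshold and every failure is below it."""
    return all((size >= threshold) == ok for _, size, ok in rows)

print(pattern_holds(observations))
```

In practice the rows would come from a bucket listing combined with the sync log rather than being typed in by hand.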

Starlton
  • So big question – Arun Kamalanathan Dec 05 '19 at 09:16
  • Arunmainthan, I know there's a lot involved when trying to migrate a private bucket from an old account to a new one, especially when KMS encryption is involved. I've done this many times over for our company for private buckets that used AES-256 encryption and I believe I have added the extra special sauce when using KMS instead, which works "partially". In fact all objects in some prefix directories copy over without issue and they are all using the same KMS key with the same object owner. I cannot find any logical reason why some objects fail. – Starlton Dec 05 '19 at 20:24
  • Possibly related? https://stackoverflow.com/questions/31254640/error-uploading-small-files-to-s3-using-s3cmd – MyStackRunnethOver Dec 05 '19 at 22:57
  • Quick question @Starlton, Is this all about encryption at rest? or If you download an encrypted file using console, will it be encrypted? – Arun Kamalanathan Dec 05 '19 at 23:04
  • S3 uses server-side encryption. If you have access to download an object the downloaded file will be unencrypted. – Starlton Dec 05 '19 at 23:36
  • `Possibly related? stackoverflow.com/questions/31254640/… – MyStackRunnethOver` Thanks for the thought. When I read that post I thought nooo... but I tried it anyway. Turns out in this case that wasn't the problem. Note: How I tested was adding the `"s3:GetBucketLocation"` to the source bucket policy, the assume role in the source account and the policy of the iam instance profile on the target, basically leaving no stone unturned -- same results. – Starlton Dec 06 '19 at 00:07

1 Answer

I found a solution, although I still think this is a bug with aws s3 sync. With the following set in ~/.aws/config, all objects synced successfully:

[default]
output = json
s3 =
    signature_version = s3v4
    multipart_threshold = 1

I already had the signature_version setting before, but I include it for completeness in case someone has a similar need. The new entry is multipart_threshold = 1, which means an object of any non-zero size will trigger a multipart transfer. I didn't specify multipart_chunksize, which according to the AWS CLI documentation defaults to 8MB (5MB is the minimum allowed for uploads).
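To make the effect of the setting concrete, here is a minimal sketch of the size decision. It assumes the CLI switches to a multipart transfer when an object is at or above multipart_threshold (documented default 8MB) and otherwise issues a single request such as CopyObject. `transfer_mode` is a hypothetical helper, not an AWS CLI function, and the sizes are the approximate ones from the two objects in the question.

```python
MB = 1024 * 1024
DEFAULT_MULTIPART_THRESHOLD = 8 * MB  # AWS CLI default for multipart_threshold

def transfer_mode(size_bytes, multipart_threshold=DEFAULT_MULTIPART_THRESHOLD):
    """Assumed rule: at or above the threshold means multipart, below it a single request."""
    return "multipart" if size_bytes >= multipart_threshold else "single"

for size in (int(9.2 * MB), int(3.2 * 1024)):
    # Compare the default threshold with multipart_threshold = 1.
    print(size, transfer_mode(size), transfer_mode(size, multipart_threshold=1))
```

Under this assumption the 3.2 KB object takes the single-request path by default, which is the path that returned AccessDenied, and takes the multipart path once the threshold is lowered to 1.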

Honestly, this requirement doesn't make sense: it shouldn't matter whether the object was previously uploaded to S3 using multipart or not. I know it doesn't matter when KMS isn't involved, but apparently it does when it is.

Starlton