
I would like to upload content to S3, but schedule a time at which CloudFront delivers it to clients, rather than having it vended immediately upon processing. Is there a configuration option to accomplish this?

EDIT: This time should be able to differ per object in S3.

Master_Yoda
  • What do you mean by "upon processing?" CloudFront and S3 don't "process" things, so precisely what you mean may need clarification. If someone visits the link "early," do you want them to get a `404 Not Found`? If not, then what? What is the application for this? There's no native, straightforward capability in S3 or CloudFront for a Timed Content Embargo™, which is a name I just made up to describe this, but there might be a somewhat creative workaround. Please clarify. – Michael - sqlbot Oct 16 '15 at 00:08
  • Thanks for replying, Michael. Processing here meaning the process of generating the objects and uploading them to S3. When someone visits the link early I want them to get a 403 Forbidden ideally. – Master_Yoda Oct 16 '15 at 01:11
  • I don't see any obvious way for S3/Cloudfront to handle a Timed Content Embargo (cc @Michael-sqlbot) but the easiest way to do this is probably to upload to S3 as private, then mark the object public at the end of the embargo period. You'll need to use a short TTL since a 403 will be returned until then. Lambda (especially with cron, python, and boto3) would be perfect for this. – tedder42 Oct 18 '15 at 17:55
  • @tedder42 - I was actually looking at using Lambda the same way that you suggested. Was just hoping there was a native way to do this built into the CDN. – Master_Yoda Oct 18 '15 at 18:22
  • @tedder42 thanks :) after a few hours of kicking this around, it turns out there *is* a native, built-in way, after all... just not a pretty one. – Michael - sqlbot Oct 19 '15 at 02:58

2 Answers

There is something of a configuration option to allow this, and it does allow you to restrict specific files -- or path prefixes -- from being served up prior to a given date and time... though it's slightly... well, I don't even know what derogatory term to use to describe it. :) But it's the only thing I can come up with that uses entirely built-in functionality.

First, a quick reminder that public/unauthenticated read access to objects in S3 can be granted at the bucket level with bucket policies, or at the object level, using "make everything public" when uploading the object in the console, or by sending x-amz-acl: public-read when uploading via the API. If either or both of these is present, the object is publicly readable -- except in the face of any policy denying that same access. Deny always wins over Allow.

So, we can create a bucket policy statement matching a specific file or prefix, denying access prior to a certain date and time.

{
    "Version": "2012-10-17",
    "Id": "Policy1445197123468",
    "Statement": [
        {
            "Sid": "Stmt1445197117172",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/hello.txt",
            "Condition": {
                "DateLessThan": {
                    "aws:CurrentTime": "2015-10-18T15:55:00.000-0400"
                }
            }
        }
    ]
}
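A policy like the one above could also be generated and applied programmatically -- say, from the Lambda function suggested in the comments. Here's a minimal sketch in Python; the `embargo_policy` helper is a name I'm making up, and the boto3 call is shown only as a comment:

```python
import json
from datetime import datetime, timezone

def embargo_policy(bucket, key, until):
    """Build a bucket policy document that denies s3:GetObject on
    bucket/key until the given UTC datetime, mirroring the JSON above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EmbargoUntil" + until.strftime("%Y%m%d%H%M%S"),
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::%s/%s" % (bucket, key),
                "Condition": {
                    # aws:CurrentTime accepts ISO 8601 timestamps
                    "DateLessThan": {
                        "aws:CurrentTime": until.strftime("%Y-%m-%dT%H:%M:%SZ")
                    }
                },
            }
        ],
    }

policy = embargo_policy("example-bucket", "hello.txt",
                        datetime(2015, 10, 18, 19, 55, tzinfo=timezone.utc))
print(json.dumps(policy, indent=4))
# With boto3, this would then be applied with:
#   boto3.client("s3").put_bucket_policy(Bucket="example-bucket",
#                                        Policy=json.dumps(policy))
```

Note that put_bucket_policy replaces the whole policy, so in practice you'd merge this statement into any existing statements rather than overwrite them.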

Using a wildcard would allow everything under a specific path to be subject to the same restriction.

"Resource": "arn:aws:s3:::example-bucket/cant/see/these/yet/*",

This works, even if the object is public.

This example blocks all GET requests for matching objects by anybody, regardless of permissions they may have. Signed URLs, etc., are not sufficient to override this policy.

The policy statement is checked for validity when it is created; however, the object being matched does not have to exist yet, so creating the policy before the object does not make the policy invalid.

Live test:

Before the expiration time: (unrelated request/response headers removed for clarity)

$ curl -v example-bucket.s3.amazonaws.com/hello.txt
> GET /hello.txt HTTP/1.1
> Host: example-bucket.s3.amazonaws.com
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Date: Sun, 18 Oct 2015 19:54:55 GMT
< Server: AmazonS3
<
<?xml version="1.0" encoding="UTF-8"?>
* Connection #0 to host example-bucket.s3.amazonaws.com left intact
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>AAAABBBBCCCCDDDD</RequestId><HostId>g0bbl3dyg00kbunc4Ofl1n3n0iz3h3rehahahasqlbot1337kenqweqwel24234kj41l1ke</HostId></Error>

After the specified date and time:

$ curl -v example-bucket.s3.amazonaws.com/hello.txt
> GET /hello.txt HTTP/1.1
> Host: example-bucket.s3.amazonaws.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Sun, 18 Oct 2015 19:55:05 GMT
< Last-Modified: Sun, 18 Oct 2015 19:36:17 GMT
< ETag: "78016cea74c298162366b9f86bfc3b16"
< Accept-Ranges: bytes
< Content-Type: text/plain
< Content-Length: 15
< Server: AmazonS3
<
Hello, world!

These tests were done against the S3 REST endpoint for the bucket, but the website endpoint for the same bucket yields the same results -- only the error message is in HTML rather than XML.

The positive aspect of this policy is that since the object is public, the policy can be removed any time after the date passes, because it is denying access before a certain time, rather than allowing access after a certain time -- logically the same, but implemented differently. (If the policy allowed access after rather than denying access before, the policy would have to stick around indefinitely; this way, it can just be deleted.)
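Because the statements deny before a time rather than allow after it, expired statements are dead weight and can simply be pruned. A sketch of that cleanup, assuming the statement and condition shapes shown earlier, and assuming timestamps in the trailing-"Z" form (an offset form like -0400 would need extra parsing):

```python
from datetime import datetime, timezone

def prune_expired_embargoes(policy, now=None):
    """Drop Deny statements whose DateLessThan cutoff has already passed;
    once the time is reached, they no longer affect any request."""
    now = now or datetime.now(timezone.utc)
    kept = []
    for stmt in policy.get("Statement", []):
        cutoff = (stmt.get("Condition", {})
                      .get("DateLessThan", {})
                      .get("aws:CurrentTime"))
        if stmt.get("Effect") == "Deny" and cutoff:
            lifted = datetime.strptime(
                cutoff, "%Y-%m-%dT%H:%M:%SZ"
            ).replace(tzinfo=timezone.utc) <= now
            if lifted:
                continue  # embargo has lifted; discard the statement
        kept.append(stmt)
    policy["Statement"] = kept
    return policy
```

Run periodically (again, Lambda on a schedule would fit), this keeps the bucket policy from accumulating stale statements.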

You could use custom error documents in either S3 or CloudFront to present the viewer with slightly nicer output... probably CloudFront, since you can customize each error code individually, creating a custom 403 page.

The major drawbacks to this approach are, of course, that the policy must be edited for each object or path prefix, and that even though the restriction applies per object, it isn't configured on the object itself -- it lives in the bucket policy.

And there is a limit to how many policy statements you can include, because of the size restriction on bucket policies:

Note

Bucket policies are limited to 20 KB in size.

http://docs.aws.amazon.com/AmazonS3/latest/dev/access-policy-language-overview.html
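Given that limit, it's worth checking the serialized size before applying a policy. A trivial guard -- the 20 KB figure is from the documentation quoted above:

```python
import json

BUCKET_POLICY_LIMIT = 20 * 1024  # 20 KB, per the S3 docs quoted above

def policy_fits(policy):
    """Return True if the serialized policy is within S3's size limit."""
    return len(json.dumps(policy).encode("utf-8")) <= BUCKET_POLICY_LIMIT

small = {"Version": "2012-10-17", "Statement": []}
print(policy_fits(small))  # → True
```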


The other solution that comes to mind involves deploying a reverse proxy component (such as HAProxy) in EC2 between CloudFront and the bucket, passing the requests through and reading the custom metadata from the object's response headers, looking for a header such as x-amz-meta-embargo-until: 2015-10-18T19:55:00Z and comparing its value to the system clock. If the current time were before the cutoff time, the proxy would drop the connection from S3 and replace the response headers and body with a locally-generated 403 message, so the client would not be able to fetch the object until the designated time had passed.

This solution seems fairly straightforward to implement, but it requires a non-built-in component, so it doesn't meet the constraint of the question, and I haven't built a proof of concept. However, I already use HAProxy with Lua in front of some buckets to give S3 other capabilities not offered natively -- such as removing sensitive custom metadata from responses, and directing the browser to apply an XSL stylesheet to the XML in S3 error responses -- so no obvious reason comes to mind why this application wouldn't work equally well.
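The core of that proxy logic is just a header comparison. A minimal sketch of the decision in Python -- x-amz-meta-embargo-until is the made-up header convention described above, and the timestamp is assumed to be in the trailing-"Z" UTC form:

```python
from datetime import datetime, timezone

def embargoed(response_headers, now=None):
    """Return True if the proxy should replace the S3 response with a
    locally-generated 403, based on a custom embargo metadata header."""
    now = now or datetime.now(timezone.utc)
    until = response_headers.get("x-amz-meta-embargo-until")
    if until is None:
        return False  # no embargo metadata; pass the response through
    cutoff = datetime.strptime(until, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc)
    return now < cutoff  # still embargoed until the cutoff time
```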

Michael - sqlbot
  • Wow, this is probably the best answer I've received here. Seems like it will probably be easier / more maintainable to either move the files from a forbidden bucket to a bucket with read access or to change the permissions when they are ready to be released. Thanks for taking the time to present this fascinating option. – Master_Yoda Oct 19 '15 at 02:00
  • @Master_Yoda I may have overlooked something obvious, here... if the link you give your users is a *CloudFront* (not S3) pre-signed URL, you can embed a timestamp in the signed policy document, restricting access prior to the specified time. http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-creating-signed-url-custom-policy.html#private-content-custom-policy-statement – Michael - sqlbot Oct 26 '15 at 15:59
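For illustration, the custom policy document for such a CloudFront signed URL would look something like the sketch below. The distribution URL is a placeholder, and the actual signing step (RSA-signing the policy with a CloudFront key pair) is omitted; only the policy document itself is built here:

```python
import json
from datetime import datetime, timezone

def cloudfront_custom_policy(url, not_before, expires):
    """Build a CloudFront signed-URL custom policy that blocks access
    before not_before and after expires (both UTC datetimes)."""
    return json.dumps({
        "Statement": [
            {
                "Resource": url,
                "Condition": {
                    # DateGreaterThan is the "embargo" part
                    "DateGreaterThan": {
                        "AWS:EpochTime": int(not_before.timestamp())
                    },
                    # DateLessThan (expiration) is required by CloudFront
                    "DateLessThan": {
                        "AWS:EpochTime": int(expires.timestamp())
                    },
                }
            }
        ]
    }, separators=(",", ":"))

policy = cloudfront_custom_policy(
    "https://d111111abcdef8.cloudfront.net/hello.txt",  # placeholder
    datetime(2015, 10, 18, 19, 55, tzinfo=timezone.utc),
    datetime(2015, 10, 19, 19, 55, tzinfo=timezone.utc))
print(policy)
```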

Lambda@Edge can apply your customized access control easily.

billibit
  • Please elaborate by adding an example. – JJJ May 06 '19 at 00:28
  • Lambda@Edge Node.js (sketch; `rules` and `checkAllowedFromTimeValidation` are left undefined, as in the original):

    exports.handler = (event, context, callback) => {
        const request = event.Records[0].cf.request;
        accessrules(rules, request, callback);
    };

    function accessrules(rules, request, callback) {
        if (rules.timeValidation.enabled === true) {
            var allowed = checkAllowedFromTimeValidation(request.headers,
                rules.timeValidation.dateFrom, rules.timeValidation.dateTo);
            if (allowed) {
                callback(null, request);  // forward the request unchanged
                return;
            }
        }
        // otherwise, generate a 403 response here
    }

    – billibit May 17 '19 at 16:22