68

I have a React app hosted on an S3 bucket. The code is minified using yarn build (it's a create-react-app based app). The build folder looks something like:

build
├── asset-manifest.json
├── favicon.ico
├── images
│   ├── map-background.png
│   └── robot-icon.svg
├── index.html
├── js
│   ├── fontawesome.js
│   ├── packs
│   │   ├── brands.js
│   │   ├── light.js
│   │   ├── regular.js
│   │   └── solid.js
│   └── README.md
├── service-worker.js
└── static
    ├── css
    │   ├── main.bf27c1d9.css
    │   └── main.bf27c1d9.css.map
    └── js
        ├── main.8d11d7ab.js
        └── main.8d11d7ab.js.map

I never want index.html to be cached, because if I update the code (causing the hex suffix in main.*.js to update), I need the user's next visit to pick up on the <script src> change in index.html to point to the updated code.

In CloudFront, I can only seem to exclude paths, and excluding "/" doesn't seem to work properly. I'm getting strange behavior where I change the code, and if I hit refresh, I see it, but if I quit Chrome and go back, I see very outdated code for some reason.

I don't want to have to trigger an invalidation on every code release (via CodeBuild). Is there some other way? I think one of the challenges is that since this is an app using React Router, I'm having to do some trickery by setting the error document to index.html and forcing an HTTP status 200 instead of 403.

ffxsam

5 Answers

63

A solution based on CloudFront configuration:

Go to your CloudFront distribution, open the "Behavior" tab, and create a new behavior. Specify the following values:

  • Path Pattern: index.html
  • Object Caching: customize
  • Maximum TTL: 0 (or another very small value)
  • Default TTL: 0 (or another very small value)

Save this configuration.

CloudFront will not cache index.html anymore.

foobar443
  • 2
    Hi @seza443 does this work for index.html files in inner directories, or just for the one in root directory? – Tumaini Mosha Jun 27 '19 at 12:21
  • @TumainiMosha not sure about this. I would say only the root directory. To match `index.html` in sub-directories as well, try `*index.html`? – foobar443 Jun 28 '19 at 13:11
  • 6
    Thanks for feedback. I can confirm the initial rule (`index.html`) works for inner directories. I checked the cloudfront documentation, basically, their rules match file names, even in inner directories, not just root directory. – Tumaini Mosha Jun 30 '19 at 09:18
  • 4
    In case anyone else was curious: achieving this from the AWS CLI is not very convenient. – bbeecher Mar 20 '20 at 04:05
  • 3
    This is the way to go if you're using CI/CD solutions and your index.html gets updated every time. IMHO it should be accepted as the correct answer. – Giorgio Fellipe Jun 15 '20 at 16:31
  • If I do this do I also need to update anything on the S3 object? – Andrew Nov 13 '20 at 16:42
  • 3
    I think this answer applies to the `Use legacy cache settings` option in `Cache and origin request settings`. If you want to use an existing managed policy by selecting the `Use a cache policy and origin request policy` option, I think the correct value is `Managed-CachingDisabled` – justanotherdev Jun 10 '21 at 03:01
49

If you never want index.html to be cached, set the Cache-Control: max-age=0 header on that file only. CloudFront will make a request back to your origin S3 bucket on every request, but it sounds like this is desired behavior.

If you want to set longer expiry times and invalidate the CloudFront cache manually, you can use * or /* as your invalidation path (not / as you mentioned). However, it can take up to 15 minutes for all CloudFront edge nodes around the world to reflect the changes in your origin.
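
As a rough sketch of what "on that file only" could look like in a deploy step, assuming the create-react-app build folder from the question: the bucket name, the `run` helper, the `DRY_RUN` switch, and the one-year lifetime for the hashed files under static/ are my own assumptions, not part of this answer.

```shell
#!/usr/bin/env bash
# Sketch only. BUCKET and BUILD_DIR are placeholders; adjust to your setup.
BUCKET="${BUCKET:-s3://example-bucket}"
BUILD_DIR="${BUILD_DIR:-./build}"

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

deploy() {
  # Hashed bundles (main.<hash>.js/css) get a new name on every change,
  # so they can be cached for a long time (assumed: one year).
  run aws s3 sync "$BUILD_DIR/static" "$BUCKET/static" \
    --cache-control "public,max-age=31536000,immutable" || return 1
  # Everything else except index.html, with no explicit Cache-Control.
  run aws s3 sync "$BUILD_DIR" "$BUCKET" \
    --exclude "static/*" --exclude "index.html" || return 1
  # index.html goes last, so it never points at bundles that are not uploaded yet.
  run aws s3 cp "$BUILD_DIR/index.html" "$BUCKET/index.html" \
    --cache-control "max-age=0" --content-type "text/html" || return 1
}

deploy
```

By default this only prints the three commands; running it with `DRY_RUN=0 BUCKET=s3://your-bucket` would execute them.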

Luke Peterson
  • And are you talking about adding that header in the S3 object metadata? And correct, regardless of the URL path, I never want index.html cached. I'm more concerned about caching the related files (JS files, CSS, images). – ffxsam Aug 17 '17 at 16:43
  • Yes: it's referred to as 'System-Defined Metadata' in the AWS documentation: http://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-object-metadata.html – Luke Peterson Aug 17 '17 at 21:22
  • 11
    Great, this is helpful! I've set up my deploy process to use `aws s3 cp --cache-control max-age=0` when copying over the index file. Works like a charm. – ffxsam Aug 17 '17 at 22:12
  • 1
    I know these are old Q&A, but I've accomplished the same by setting `Cache-Control: no-store` header (as metadata in S3 file) for index.html. – Mauri Q Dec 23 '19 at 14:12
26

Here is the command I ran to set Cache-Control on my index.html file after uploading new files to S3 and invalidating CloudFront:

aws s3 cp s3://bucket/index.html s3://bucket/index.html --metadata-directive REPLACE --cache-control max-age=0 --content-type "text/html"
Gal Silberman
chadhamre
0

The solutions given are all valid; I just want to reiterate everything you can try to get this setup working (at least, what worked for me):

Have CloudFront take the lead in the caching policy

You do this by making sure the content in your S3 bucket doesn't have max-age set to anything other than 0. So, for example, when using aws s3 sync, you can use:

aws s3 sync ./build/ $S3_BUCKET --delete --cache-control max-age=0

Or, if you use the aws s3 cp command, see @chadhamre's solution above.

Please make sure you check this in your CI pipelines: I accidentally had max-age set to a large number and couldn't figure out why solutions like cache invalidation did not work.
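
One way such a pipeline check might look, sketched under some assumptions: read back the Cache-Control value stored on the object and fail the build if it allows long-lived caching. The function name `is_short_lived` and the bucket name in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Succeeds when the Cache-Control value contains no-store, no-cache, or max-age=0.
is_short_lived() {
  # Normalise: drop spaces, lowercase, wrap in commas so each directive matches whole.
  v=",$(printf '%s' "$1" | tr -d ' ' | tr '[:upper:]' '[:lower:]'),"
  case "$v" in
    *,no-store,*|*,no-cache,*|*,max-age=0,*) return 0 ;;
    *) return 1 ;;
  esac
}

# In CI, the value could be read from the uploaded object, for example:
#   value="$(aws s3api head-object --bucket example-bucket --key index.html \
#            --query CacheControl --output text)"
value="max-age=0"   # hard-coded here so the sketch runs without AWS access

if is_short_lived "$value"; then
  echo "ok: index.html is not cached long-term"
else
  echo "fail: unexpected Cache-Control '$value'" >&2
  exit 1
fi
```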

What this does: As far as I understand it, the objects in S3 no longer carry a cache lifetime of their own. Instead, CloudFront takes the lead in deciding the caching policy, since CloudFront is responsible for serving the content to the end user. Based on its own caching policy, CloudFront decides when it needs to make a round trip to S3 to get the latest content, and we now know that S3 will always serve the latest content because of the max-age=0 header.

CloudFront invalidation

You also want all of your CloudFront cache to be invalidated on each release in your deployment pipeline. Invalidate the distribution's cache like this, for example:

aws cloudfront create-invalidation --distribution-id $DISTRIBUTION_ID --paths '/*'

Why isn't this enough: Because this only invalidates the cached files in CloudFront's edge locations. As @Uğur Dinç also pointed out: if index.html is first served without a Cache-Control header that prevents caching, the browser will not even ask the server for a newer version on later visits. So even though there is a new file at CloudFront, the browser uses its own cached copy without making the call. The main issue is that this old index.html refers to older bundle files which no longer exist, and loading it is exactly what triggers the white screen.

What will this be used for?
This invalidation is perfect for when you do have a caching policy for all your other files besides index.html. On a new release you also want the cache for those other files, such as assets, to be invalidated. For example, when index.html loads your new favicon under the same filename, the newer version will be shown.

Custom caching behavior for index.html

As CloudFront is now fully in charge of the caching policy, and index.html referring to older bundle files is the main issue, we want to set up a custom caching behaviour for it:

  1. In the AWS dashboard, click on the distribution at hand
  2. Click on the behaviours tab to see the table of behaviours
  3. Create a new behaviour with the following settings:
    • Path pattern: index.html
    • Under "Cache key and origin requests", select "CachingDisabled" as the "Cache Policy"
  4. Click on create and make sure that, in the table of behaviours, index.html is at position 0, as this is the first behaviour we want CloudFront to evaluate.
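
To see whether the new behaviour has taken effect, one option is to look at the X-Cache response header that CloudFront adds. The sketch below is only an illustration: the helper name `cache_status` and the URL in the usage comment are placeholders, and the sample headers are hard-coded so it runs without network access.

```shell
#!/usr/bin/env bash
# Reads HTTP response headers on stdin and prints the value of X-Cache,
# e.g. "Miss from cloudfront" or "Hit from cloudfront".
cache_status() {
  tr -d '\r' | awk -F': ' 'tolower($1) == "x-cache" { print $2 }'
}

# Against a real distribution (placeholder URL):
#   curl -sI https://example.cloudfront.net/index.html | cache_status

# Hard-coded sample response for illustration:
printf 'HTTP/2 200\r\ncontent-type: text/html\r\nx-cache: Miss from cloudfront\r\n' | cache_status
# prints: Miss from cloudfront
```

If repeated requests for index.html keep reporting a Hit, the behaviour is probably not matching, or it is not at the top of the behaviours table.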

Final words

I am no expert in this field and perhaps you won't need all 3 of them, but these were the settings that worked for me. Hope this helps someone who has been struggling with this for a while like myself:)

Vasco
-2

It's much better to run an invalidation for index.html on every release than to defeat Cloudfront's purpose and serve it (what is basically an entrypoint for your app) from S3 every single time.
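
A minimal sketch of that per-release step, assuming the distribution ID is available as DISTRIBUTION_ID (the placeholder value and the `DRY_RUN` switch are my own additions). It clears only CloudFront's edge caches; copies already held by browsers are not affected.

```shell
#!/usr/bin/env bash
# Placeholder ID; in CodeBuild this would come from the environment.
DISTRIBUTION_ID="${DISTRIBUTION_ID:-EXAMPLEDISTID}"

invalidate_index() {
  # Invalidation paths must start with a slash.
  set -- aws cloudfront create-invalidation \
    --distribution-id "$DISTRIBUTION_ID" --paths "/index.html"
  # Print the command unless DRY_RUN=0 is set.
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

invalidate_index
```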

asliwinski
  • 11
    The problem with that is once the browser sees an index.html without cache control header set for not caching, then the browser won't even go to the server on subsequent calls. As a result, Cloudfront invalidation won't do a thing for the existing users. – Uğur Dinç May 22 '20 at 16:22
  • 1
    @UğurDinç "The problem with that is once the browser sees an index.html without cache control header set for not caching, then the browser won't even go to the server on subsequent calls." It actually does, at least Chrome. Still, to make it more deterministic, you can add the header to S3 objects. E.g. if you set minimum TTL > 0 and greater than max-age, Cloudfront will cache the objects for the value of the TTL (while the browser will use max-age). See: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html – asliwinski May 23 '20 at 03:32
  • We're finding that not all edge locations are being cleared when using invalidations. We're looking to mark index.html as not validating (we've built an SPA in Angular and noticing that some users are getting the old version). – jackofallcode Jan 25 '21 at 10:07
  • @jackofallcode I'm observing the same issue lately, have you perhaps found any solution to this problem? – asliwinski Jan 26 '21 at 17:16
  • @asliwinski we've opted to set up a cache behaviour on the CloudFront distribution to not cache index.html. With Angular it's only small, and the main core app (the JS and the CSS) is cached as expected (and new builds create new files rather than updating the existing ones). Not ideal, but a workaround. – jackofallcode Jan 26 '21 at 20:59