We host videos in Amazon S3 and use CloudFront to deliver them. Here are last month's numbers:
4.2 TB total traffic from CloudFront to end users. 4 TB of this is in Europe, the rest mainly in the US.
This cost us $512. All as expected.
In addition, we pay for transfer from S3 to the edge locations: roughly 1 TB, which cost $115.
That is about 22.5% of the cost of transfer to the end user and 18% of the combined cost. What I wonder is: is that normal? I feel it's a bit much.
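For reference, this is how I arrive at those ratios. It is only a quick sketch using the figures from the bill quoted above; the variable names are my own, and the "byte miss ratio" is a rough estimate I derived (bytes pulled from S3 per byte delivered), not a number CloudFront reports.

```python
# Figures from last month's bill, as quoted above.
edge_to_user_tb = 4.2        # CloudFront -> end users
edge_to_user_cost = 512.0    # USD
origin_to_edge_tb = 1.0      # S3 -> edge locations (approximate)
origin_to_edge_cost = 115.0  # USD

# Origin transfer cost relative to the delivery cost alone.
ratio_vs_delivery = origin_to_edge_cost / edge_to_user_cost

# Origin transfer cost relative to the combined bill.
ratio_vs_total = origin_to_edge_cost / (edge_to_user_cost + origin_to_edge_cost)

# Rough estimate (my assumption): share of delivered bytes that
# had to be fetched from S3 first.
byte_miss_ratio = origin_to_edge_tb / edge_to_user_tb

print(f"origin cost vs delivery cost: {ratio_vs_delivery:.1%}")
print(f"origin cost vs combined cost: {ratio_vs_total:.1%}")
print(f"approx. byte miss ratio:      {byte_miss_ratio:.1%}")
```

If that last estimate is in the right ballpark, roughly a quarter of the bytes we deliver are first pulled from S3, which is what makes me suspect the long tail of rarely watched videos.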
We have 280 GB of video stored in S3, spread over about 20,000 videos. Most of our traffic comes from fresh videos. The top 100 videos for that month were responsible for about 80% of the total traffic.
I suspect that too many videos are downloaded to edge locations unnecessarily and then see only a few hits. They are then evicted, and when one is requested again later we pay for another download to the edge server.
- Is this ratio normal, or at least somewhat close to other people's experiences?
- Can I somehow tell CloudFront to apply a threshold before it starts caching a file? Something like: don't cache this file until you have seen x downloads within a day near your location.
- If the above is not possible, are there other tips on how I could achieve the same thing? For example, don't serve less frequently accessed files through CloudFront, but go directly to S3 instead. I feel this might be a problem, since I use the streaming feature of CloudFront anyway.
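To make the last point concrete, here is a rough sketch of what I have in mind on the application side. Everything in it is hypothetical: the two hostnames, the `pick_video_url` function, the threshold of 10, and the assumption that I can look up how many times a video was requested in the last day.

```python
# Hypothetical hostnames, for illustration only.
CDN_BASE = "https://cdn.example.com"
S3_BASE = "https://example-bucket.s3.amazonaws.com"


def pick_video_url(key: str, hits_last_day: int, threshold: int = 10) -> str:
    """Return the URL a player should use for the video stored under `key`.

    Videos requested at least `threshold` times in the last day go through
    the CDN; everything else is served straight from the bucket so it
    never triggers an origin-to-edge transfer.
    """
    base = CDN_BASE if hits_last_day >= threshold else S3_BASE
    return f"{base}/{key.lstrip('/')}"


# Example: a popular video and a rarely watched one.
print(pick_video_url("videos/fresh.mp4", hits_last_day=250))
print(pick_video_url("videos/old.mp4", hits_last_day=2))
```

This only covers plain HTTP downloads. Since I rely on CloudFront streaming, I am not sure the rarely watched videos could be served from S3 in the same way.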
I haven't changed the TTL, so I assume it's at the default of 24 hours. Increasing it wouldn't change anything anyway, since CloudFront will not download a file again if it has not changed (and ours never do).
The problem is that the files drop out of the edge cache.