9

So I'm new to Amazon S3 and was wondering if somebody could help answer this question.

I have a set of static API/JSON files that power a mobile app. The JSON data is mostly static, but an update can be triggered at any time, which changes the data and therefore rewrites the JSON file.

How does Amazon handle file updates with regard to access? What I mean is: if a person is accessing the file at the moment I want to write to it, will the write be blocked, or does Amazon employ some file cache to prevent this from happening?

starball
proxim0
  • If you are concerned about stale data in your API, you can use DynamoDB with conditional writes and strongly consistent reads. A conditional write lets a change go through only if the data has not changed since the previous read, and [Strongly consistent reads](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadConsistency.html) always return the most up-to-date data, reflecting every write that succeeded before the read. [Working with Conditional Writes](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithItems.html#WorkingWithItems.ConditionalUpdate) – strongjz Jan 03 '17 at 18:41
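
A rough sketch of what that comment describes, assuming a boto3 DynamoDB `Table` resource is passed in. The attribute names `doc_id`, `payload` and `doc_version`, and both function names, are hypothetical and only serve the illustration:

```python
def read_document(table, doc_id):
    """Strongly consistent read: reflects every write acknowledged before it."""
    response = table.get_item(Key={"doc_id": doc_id}, ConsistentRead=True)
    return response.get("Item")


def write_if_unchanged(table, doc_id, payload, expected_version):
    """Write only if the stored version is still the one the caller last read.

    Returns True on success and False if another writer got there first.
    `table` is assumed to be a boto3 Table, e.g.
    boto3.resource("dynamodb").Table("documents")  (table name is made up).
    """
    conflict = table.meta.client.exceptions.ConditionalCheckFailedException
    try:
        table.put_item(
            Item={
                "doc_id": doc_id,
                "payload": payload,
                "doc_version": expected_version + 1,
            },
            # Accept the write for a brand-new item, or when nobody has
            # bumped the version since we read it.
            ConditionExpression=(
                "attribute_not_exists(doc_id) OR doc_version = :expected"
            ),
            ExpressionAttributeValues={":expected": expected_version},
        )
    except conflict:
        return False
    return True
```

A caller would typically read the item, modify it, call `write_if_unchanged` with the version it read, and retry from the read if the result is `False`.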

2 Answers

12

You won't be blocked but you can get stale data if the object already exists and has been recently replaced or deleted.

S3 provides read-after-write consistency for PUTs of new objects.

S3 provides eventual consistency for overwrite PUTs and DELETEs. What this means is that user2 could briefly (for a sub-second window) get a stale version of the JSON even though user1 has already replaced it.

Dave Maple
  • 1
    Thanks Dave! This should work then. Basically, once updates are made I trigger my app to look for them based on a cache key, and that key is only updated after the JSON file is created, so I don't think I'll encounter issues. – proxim0 Jan 03 '17 at 19:07
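
The cache-key approach in this comment could look roughly like the sketch below. It assumes a boto3 S3 client is passed in as `s3`; the function names, the separate marker object, and the choice of a SHA-256 hash as the cache key are all assumptions made for illustration, not details from the comment:

```python
import hashlib
import json


def publish(s3, bucket, data_key, version_key, data):
    """Upload the JSON document first, then the small version marker."""
    body = json.dumps(data, sort_keys=True).encode("utf-8")
    version = hashlib.sha256(body).hexdigest()
    s3.put_object(
        Bucket=bucket, Key=data_key, Body=body, ContentType="application/json"
    )
    # The marker is written last, so a client that sees the new marker
    # is looking for a document that has already been uploaded.
    s3.put_object(
        Bucket=bucket,
        Key=version_key,
        Body=version.encode("ascii"),
        ContentType="text/plain",
    )
    return version


def fetch_if_changed(s3, bucket, data_key, version_key, cached_version):
    """Return (version, data) if the marker changed, else (cached_version, None)."""
    marker = s3.get_object(Bucket=bucket, Key=version_key)
    current = marker["Body"].read().decode("ascii")
    if current == cached_version:
        return cached_version, None
    body = s3.get_object(Bucket=bucket, Key=data_key)["Body"].read()
    return current, json.loads(body)
```

If a second publish lands between the two reads in `fetch_if_changed`, the client ends up with newer data under an older cache key and simply downloads it again on the next check.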
7

You cannot actually update objects in S3. All you can do is store, retrieve, and delete an entire object. To 'update' the contents of an object, you replace the entire object.

If client A uploads a replacement object and client B accesses the same object key before that upload is acknowledged, client B will get the original object.

Here is a supporting quote from the Amazon S3 data consistency model:

Updates to a single key are atomic. For example, if you make a PUT request to an existing key from one thread and perform a GET request on the same key from a second thread concurrently, you will get either the old data or the new data, but never partial or corrupt data.

As of 2020-12-13, S3 now offers strong consistency:

After a successful write of a new object, or an overwrite or delete of an existing object, any subsequent read request immediately receives the latest version of the object. S3 also provides strong consistency for list operations, so after a write, you can immediately perform a listing of the objects in a bucket with any changes reflected.
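
As a minimal sketch of "replace the entire object", assuming a boto3 S3 client is passed in as `s3` (the two function names are hypothetical): an update is just another `put_object` to the same key, and a reader parses whatever complete object it is handed.

```python
import json


def replace_json_object(s3, bucket, key, data):
    """'Update' an object by uploading a complete replacement under the same key."""
    body = json.dumps(data).encode("utf-8")
    s3.put_object(
        Bucket=bucket, Key=key, Body=body, ContentType="application/json"
    )


def read_json_object(s3, bucket, key):
    """Fetch and parse the object; S3 returns the old or the new body, never a mix."""
    response = s3.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())
```

With a real client this would be called as, for example, `replace_json_object(boto3.client("s3"), "my-bucket", "api/data.json", payload)`, where the bucket and key names are placeholders.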

jarmod
  • Is this an atomic operation? If you read a file, you get a distributed multipart stream delivered to you over time (the larger the object, the more time required and the more parts received). If someone replaces that object mid-stream, what happens then? – Chris Ivan Feb 04 '22 at 08:10
  • @ChrisIvan good question and I can't immediately find anything authoritative. I did find [this](https://stackoverflow.com/questions/30246784/aws-s3-replace-file-atomically) and [this](https://github.com/aws/aws-cli/issues/2321). – jarmod Feb 04 '22 at 12:07
  • Thanks for sharing, @jarmod. This kind of indicates that, yeah, files do indeed get written only in parts, and that can lead to file corruption if something is attempting to read it as it's being written over. – Chris Ivan Feb 09 '22 at 06:02
  • 1
    @ChrisIvan I've updated my answer with the info from [Amazon S3 data consistency model](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel): "if you make a PUT request to an existing key from one thread and perform a GET request on the same key from a second thread concurrently, you will get either the old data or the new data, but never partial or corrupt data". – jarmod Feb 09 '22 at 13:37