Let's say I've got S3 versioning enabled for my bucket: http://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html

Then, let's say someone (for example, a junior employee) messes up the S3 bucket (accidentally deletes some files, etc.).

How can I then restore the entire versioned bucket to a particular point in time? I believe this should be possible given S3's API, but I'd rather not have to write such a script myself, for fear of missing something (I'm not an AWS expert).

Is there a good solution to this problem? I'm using the S3 bucket as an image store for my Rails app, so something Ruby-based that I could use as a rake task would be ideal.

elsurudo

2 Answers

You can use s3-pit-restore.

S3 Point in Time Restore is a tool built to restore a bucket, or a subset of a bucket, to a given point in time, like this:

s3-pit-restore --bucket my-bucket --dest my-restored-bucket --timestamp "06-17-2016 23:59:50 +2"

What s3-pit-restore actually offers:

  • Restore of all files with timestamp less than the given one
  • Restore of a whole bucket or a bucket prefix
  • Parallel download of multiple files with a great overall speed
  • Customization of parallel workers count to optimize bandwidth usage
  • Restore from S3 bucket versions or from Glacier, if enabled
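
For readers curious about what "restore of all files with timestamp less than the given one" involves, here is a minimal sketch of the selection step in Python. It is not s3-pit-restore's actual code; the names `plan_restore` and `list_entries` are hypothetical, and the sketch assumes boto3 with working AWS credentials for the listing part. For each key it keeps the newest version written at or before the cutoff and skips keys that were deleted at that moment.

```python
from datetime import datetime


def plan_restore(entries, cutoff: datetime):
    """Pick, for each key, the newest entry written at or before `cutoff`.

    `entries` is an iterable of dicts with "Key", "VersionId",
    "LastModified" (timezone-aware datetime) and "IsDeleteMarker" (bool).
    Returns {key: version_id} for the keys that existed at `cutoff`.
    """
    newest = {}
    for entry in entries:
        if entry["LastModified"] > cutoff:
            continue  # written after the restore point, so ignore it
        current = newest.get(entry["Key"])
        if current is None or entry["LastModified"] > current["LastModified"]:
            newest[entry["Key"]] = entry
    # If the newest entry is a delete marker, the key was deleted at `cutoff`.
    return {
        key: entry["VersionId"]
        for key, entry in newest.items()
        if not entry["IsDeleteMarker"]
    }


def list_entries(bucket, prefix=""):
    """Yield every version and delete marker in a bucket (hypothetical helper)."""
    import boto3  # imported lazily so plan_restore works without boto3

    paginator = boto3.client("s3").get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for version in page.get("Versions", []):
            yield {**version, "IsDeleteMarker": False}
        for marker in page.get("DeleteMarkers", []):
            yield {**marker, "IsDeleteMarker": True}
```

The resulting `{key: version_id}` map could then be fed to a copy step (for example, boto3's `copy_object` with a `VersionId` in `CopySource`) that writes each chosen version into the destination bucket.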
Angelo

If I understand the documentation correctly, when you have versioning enabled, deleting a file simply reverts the "latest" version back one version number. This, however, does not give you the ability to restore an entire bucket, which makes the previous versions in S3 unsuitable for your needs (i.e., recovery from deletion).

Keep a backup someplace else as well just in case. Stack Overflow has a question/answer on this using s3cmd. I'm sure you could find a Ruby-based script somewhere or ask on that site for help if you need it.

Nathan C
  • Correct. You're versioning each individual object in the bucket, not the bucket as a whole. – EEAA Apr 17 '14 at 12:35
  • Oh, I understand all that. Which is why I realize it's not that simple. I'd probably have to traverse all the files in the bucket, get version info for each file, and then pick the correct item (if it exists) based on the date time I want to "revert" to. Not so simple. I did figure Amazon would have thought of something for such a common use case, but alas, no... So I was wondering if someone wrote this tedious script already. I will look into `s3cmd`, but I do like having versioned snapshots on S3 as well. – elsurudo Apr 17 '14 at 18:43
  • This answer contains incorrect information: a simple `delete` inserts a delete marker, and future requests return a 404, not the previous version. To restore, you can copy an old version to a new version, or you can `delete` the current version of the object by its specific version ID; future `get`s then return the second-to-latest version. http://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingObjectVersions.html To be fair, the documentation around bucket versioning tends to be vague and lacking... – keen Jan 12 '17 at 15:46
  • @keen Note that this question was answered almost three years ago...it's very possible they updated the documentation. Good catch, though. – Nathan C Jan 12 '17 at 18:30
  • The link referenced for documentation is about recovery (the super high-level recovery view) and hasn't changed. To be fair, it's more than a little confusing when it starts talking about deletes. I just wanted to make sure no one saw this and thought "just deleting an object from a versioned S3 bucket means the old version will start being returned". That CAN happen, but you have to specifically delete the current version (.../key?versionId=xyz) for that to happen... – keen Jan 12 '17 at 22:08
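
The point made in the comments above, that deleting the current version (here, the delete marker itself) by its version ID brings the previous version back, can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `markers_to_remove` and `undelete` are hypothetical names, boto3 and AWS credentials are assumed, and the marker dicts follow the shape that boto3's `list_object_versions` returns under `DeleteMarkers`.

```python
from datetime import datetime


def markers_to_remove(delete_markers, since: datetime):
    """Return (key, version_id) pairs for delete markers that are the
    current version of their key and were created at or after `since`."""
    return [
        (marker["Key"], marker["VersionId"])
        for marker in delete_markers
        if marker["IsLatest"] and marker["LastModified"] >= since
    ]


def undelete(bucket, since: datetime):
    """Remove recent delete markers so the previous versions are served again."""
    import boto3  # imported lazily so markers_to_remove works without boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket):
        markers = page.get("DeleteMarkers", [])
        for key, version_id in markers_to_remove(markers, since):
            # Deleting the marker by VersionId removes only the marker;
            # the version underneath becomes the current one again.
            s3.delete_object(Bucket=bucket, Key=key, VersionId=version_id)
```

Note that this only handles accidental deletions. If a key carries several stacked delete markers, removing the latest one exposes the next marker, so the sketch would need to run again; files that were overwritten rather than deleted need the copy-an-old-version approach instead.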