Scenario: A frustrated employee deletes all data from an AWS account.

Valuable data: EBS volume of an EC2 machine based on Amazon's Linux AMI.

What is a simple offline backup solution for the data? rsync?

Note that automation of the backup is not part of this question.

feklee
  • An offline backup is a good feature to implement whether you have a frustrated employee or not. If you do not have full confidence in your employees, it might also be good to partition their powers with roles via IAM. – SunSparc Apr 30 '13 at 15:39

2 Answers

rsync is a good choice for moving data between Linux machines. Be very careful about how you run it, though: if credentials for the target (backup) machine are sitting on the host machine, what's to stop someone from using them to destroy the target machine too?

Make sure you run rsync from the TARGET machine, so that it pulls the data, and that only a limited set of people have access to the target machine.
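
To make the "pull from the target" idea concrete, here is a minimal Python sketch that would run on the backup machine. It is only an illustration under some assumptions: the source string `backup@ec2-host:/data/`, the local directory layout with one dated folder per run, and the function names are all hypothetical. The rsync options used (`-a`, `-z`, `-e ssh`, `--link-dest`) are standard ones, and `--delete` is deliberately left out so that a deletion on the EC2 side is not mirrored into the backup.

```python
import subprocess
from datetime import date
from pathlib import Path


def build_pull_command(source, backup_root, today=None):
    """Build an rsync command that pulls `source` into a dated local directory.

    source      -- rsync source, e.g. "backup@ec2-host:/data/" (hypothetical)
    backup_root -- local directory holding one YYYY-MM-DD subdirectory per run
    Returns (argv, destination_path).
    """
    today = today or date.today()
    root = Path(backup_root)
    dest = root / today.isoformat()

    # -a keeps permissions and timestamps, -z compresses, -e ssh sets the transport.
    argv = ["rsync", "-a", "-z", "-e", "ssh"]

    # Find earlier dated runs so unchanged files can be hard-linked, not copied.
    previous = []
    if root.is_dir():
        previous = sorted(
            p for p in root.glob("????-??-??")
            if p.is_dir() and p.name < dest.name
        )
    if previous:
        # Absolute path: a relative --link-dest is resolved against the destination.
        argv.append(f"--link-dest={previous[-1].resolve()}")

    argv += [source, f"{dest}/"]
    return argv, dest


def pull(source, backup_root):
    """Run the pull. Only the backup machine holds the SSH key used here."""
    argv, dest = build_pull_command(source, backup_root)
    Path(backup_root).mkdir(parents=True, exist_ok=True)
    subprocess.run(argv, check=True)
    return dest
```

Calling `pull("backup@ec2-host:/data/", "/srv/backups")` from the backup machine would then leave one directory per day. Because the EC2 host never holds credentials for the backup machine, someone with access to the AWS account alone cannot reach the old copies.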

You could look into external hosting providers that do backups for you, and keep the credentials for that account with only a limited set of people.

The other thing you could look into is scripting something with AWS that takes a snapshot of the EC2 instance's EBS volume and saves it, or ships it off to another AWS account or to a special IAM user that only a limited set of people have access to. If you can get your data into S3, you could write your own service to pull it down into another AWS account, a local computer, another hosting provider, etc.
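
As a rough sketch of the snapshot idea, assuming the boto3 library (the current AWS SDK for Python, which is my choice here and not something named above), the function below snapshots a volume and grants a second AWS account permission to use the snapshot. The function name, the volume ID and the account ID are placeholders. The client is passed in as a parameter so the logic does not depend on how credentials are configured.

```python
def snapshot_and_share(ec2, volume_id, backup_account_id):
    """Snapshot an EBS volume and share the snapshot with another AWS account.

    ec2               -- a boto3 EC2 client, e.g. boto3.client("ec2")
    volume_id         -- ID of the EBS volume to back up (placeholder value)
    backup_account_id -- 12-digit ID of the separate backup account (placeholder)
    Returns the ID of the new snapshot.
    """
    snapshot = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"offline backup of {volume_id}",
    )
    snapshot_id = snapshot["SnapshotId"]

    # Block until the snapshot has finished, so the other account can copy it.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # Allow the backup account to create volumes from (and copy) the snapshot.
    ec2.modify_snapshot_attribute(
        SnapshotId=snapshot_id,
        Attribute="createVolumePermission",
        OperationType="add",
        UserIds=[backup_account_id],
    )
    return snapshot_id
```

Sharing alone does not protect the data: the shared snapshot still lives in the original account and disappears if it is deleted there. The backup account would have to make its own copy, for example with `copy_snapshot` run under its own credentials, and only a limited set of people should hold those credentials.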

Another useful thing is to not give employees permission to delete data from an AWS account unless they need it. This may or may not be practical, depending on your situation.

Drew Khoury
  • Thanks for the suggestions. For the given purpose, it would be great if Amazon offered a [WORM](http://en.wikipedia.org/wiki/Write_once_read_many) storage for data which *never* gets deleted. Something like the [Internet Archive](http://www.archive.org), but where you pay for writing data. – feklee May 01 '13 at 12:27
  • If you feel one of these answers is the most correct and complete please select it as the correct answer. Otherwise let us know if you need further clarification. – Drew Khoury May 05 '13 at 03:35
Yes, rsync is the best option. I used to rsync before I automated everything with Python and boto. You can also create another S3 account and upload the EBS snapshots you take in AWS EC2 there. Obviously, the frustrated employee should not know about that S3 account.

As feklee pointed out, there is an alternative to S3: Amazon Glacier. It is similar to S3 but has a more complicated pricing structure. If you want, you can read about it in this question:

https://stackoverflow.com/questions/14652276/backup-amazon-s3-or-glacier-lots-of-little-files

Danila Ladner
  • Do you `rsync` just to a directory on your machine? Or do you `rsync`, for example, to a dedicated virtual machine? I would like to have a solution that works on Windows, by the way. – feklee Apr 30 '13 at 15:38
  • I used to rsync to a dedicated offsite machine (a VPS in a colo I had). It wasn't Windows, though. For Windows, I would suggest looking into this: http://www.superflexible.com/partial.htm. – Danila Ladner Apr 30 '13 at 15:44
  • *before I automated everything with py boto* - could you elaborate just a little? You replaced `rsync` by what? – feklee May 06 '13 at 00:45
  • I replaced rsync with my own Python scripts that use the AWS EC2 APIs. You can look at the boto Python library, which already has a lot of methods for working with your EC2 instances and EBS volumes. – Danila Ladner May 06 '13 at 00:47
  • Thanks! You may want to add [Glacier](http://aws.amazon.com/glacier/) as an alternative to S3, to make your answer more complete. And perhaps you have an opinion on that? – feklee May 06 '13 at 00:52
  • @DanilaLadner How do you upload your EBS snapshots to S3? I thought EBS snapshots couldn't be managed via S3, even though they use the S3 storage layer behind the scenes. – Martijn Heemels Aug 13 '14 at 10:15