7

I'm building some AMIs from one of the basic ones on EC2. One of the instance types runs Tomcat and holds a lot of Lucene indexes; another will run MySQL and has correspondingly large data requirements.

I'm trying to work out the best way to include that data in the AMIs that I'm authoring. If I mount /mnt/lucene and /mnt/mysql, those don't get included in the generated AMI. So it seems to me that the preferred way to deal with them is to have an EBS volume for each one, take snapshots, and spin up instances that each get their own EBS volume based on the most recent snapshot. Is that the best way to proceed?
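To make the snapshot-per-volume idea concrete, here is a minimal sketch of "find the newest snapshot, create a volume from it, attach it". It assumes a boto3-style EC2 client is passed in as `ec2`; the function names, the default device name `/dev/sdf`, and any IDs you pass in are placeholders for illustration, and pagination of the snapshot listing is ignored.

```python
def latest_completed_snapshot(ec2, volume_id):
    """Return the newest completed snapshot of volume_id, or None if there is none."""
    resp = ec2.describe_snapshots(
        Filters=[
            {"Name": "volume-id", "Values": [volume_id]},
            {"Name": "status", "Values": ["completed"]},
        ]
    )
    snapshots = resp["Snapshots"]
    if not snapshots:
        return None
    # StartTime is the moment the snapshot was initiated.
    return max(snapshots, key=lambda s: s["StartTime"])


def restore_volume(ec2, snapshot_id, instance_id, availability_zone, device="/dev/sdf"):
    """Create a volume from a snapshot and attach it to an instance.

    The volume has to be created in the same availability zone as the instance.
    """
    volume = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=availability_zone
    )
    volume_id = volume["VolumeId"]
    # Wait until the new volume is usable before attaching it.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    return volume_id
```

The instance still has to mount the attached device (for example at /mnt/lucene or /mnt/mysql) once it shows up; that step is left out here.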

What is the point of instance storage? It seems like it will only work as a temporary storage area - what am I missing? Presumably there is a reason Amazon offer up to 800GB of storage on standard large instances...

jabley

3 Answers

4

Instance storage is faster than EBS. You don't mention what you will be doing with your instances, but for some applications speed might be more important than durability. For an application that primarily does data mining on a large database, having a few hundred gigs of fast local storage to host the DB might be beneficial. Worker nodes in a MapReduce cluster might also be great candidates for instance storage, depending on the type of job.
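Since the speed claim is worth checking on your own instances, here is a rough sketch of a sequential-write test you could point at both kinds of mount. It is not from the original answer and is far cruder than bonnie++: it only measures one large sequential write including the final fsync, says nothing about seeks or reads, and the default sizes are arbitrary assumptions.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=256, block_kb=1024):
    """Write roughly total_mb of random data into directory and return MB/s.

    The timing includes flushing and fsync, so data must reach the device.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        # Remove the scratch file whether or not the write succeeded.
        os.remove(path)
    written_mb = blocks * block_kb / 1024
    return written_mb / elapsed
```

Calling it once on an instance-store mount such as /mnt and once on the mount point of an EBS volume, several times each, gives a first impression; a single run is unlikely to be representative.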

Peter Recore
  • That's what I thought. I could do with finding some numbers on how instance storage compares with EBS, though. I ran bonnie++ on instance storage and it didn't blow my socks off. – jabley Jul 27 '09 at 21:57
  • I guess the question is how much it did (not) blow your socks off compared to the same benchmark on EBS :) I have a feeling this is one of those situations where the right choice is going to differ for everyone, and you'll have to figure out what mix of the available options works best for your particular problem. The beauty of the situation is that if one node takes two hours to churn through your data, you can always just rent two nodes and do it in 1 hour! (assuming you are blessed with parallelisable tasks) – Peter Recore Jul 28 '09 at 01:16
  • I wonder how instance storage holds up against a RAID of small EBS volumes though. Perhaps once you're using RAIDs, there's no reason to use instance storage anymore (at least for seek-heavy DB access). – Jo Liss Mar 01 '11 at 22:27
  • Are you sure instance storage is faster than EBS? I have heard otherwise, and the docs also say "if you need to improve the storage latency or throughput, we recommend using Amazon EBS": http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/instance-storage-concepts.html The main advantage seems to be that instance storage is cheaper. – Thilo Mar 13 '11 at 03:14
  • I don't know if I was just plain wrong back then or if EBS has gotten faster over the years. – Peter Recore Mar 14 '11 at 16:02
  • Most information I can find online suggests that instance storage is faster in most cases, but it doesn't seem to be clear cut. The only certainty I can find is that EBS performance is highly variable, sometimes good, sometimes poor. – El Yobo May 10 '11 at 02:56
  • http://stu.mp/2009/12/disk-io-and-throughput-benchmarks-on-amazons-ec2.html - seems thorough – El Yobo May 10 '11 at 05:04
2

Another point in favour of instance storage is that it's independent of EBS. There have been many EBS outages (google e.g. "site:aws.amazon.com ebs outage"). If the instance runs at all, its instance storage is available. Obviously, if you rely on instance storage, you need to run multiple instances (across multiple availability zones) and tolerate the failure of any single instance.

0

I know this is late to the game, but here is one other little-considered point...

EBS-backed instances make it exceedingly easy to create AMIs, whereas instance-store-backed instances require that the AMI be created locally on the machine itself, with a whole bunch of work to prep, store, and register it.
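As a hedged illustration of how little is involved on the EBS-backed side, the sketch below creates an AMI with a single API call. It assumes a boto3-style EC2 client passed in as `ec2`; the function name and the arguments you give it are placeholders. The instance-store path has no comparable one-call equivalent: the image has to be bundled on the instance, uploaded, and then registered.

```python
def create_ami_from_ebs_instance(ec2, instance_id, name):
    """Create an AMI from an EBS-backed instance and wait until it is available."""
    # NoReboot=False lets EC2 reboot the instance so the filesystem
    # is in a consistent state when the snapshot is taken.
    response = ec2.create_image(InstanceId=instance_id, Name=name, NoReboot=False)
    image_id = response["ImageId"]
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])
    return image_id
```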

oucil