
I'm using AWS spot instances and want to get the same "stop/start/setup" speed as on-demand instances. With on-demand instances, the root EBS volume stays around, so once I install all the packages, subsequent start/setup is fast.

However, for spot instances I get "Spot instances can not be stopped" (why not?). And I don't see a way to start a spot instance with an existing Root Volume.

One partial workaround I found was to mount an existing secondary volume under "/data", install packages there, and remount that volume when the spot instance restarts. However, this is limiting because some packages insist on being installed under '/'. Any suggestions?

Yaroslav Bulatov
  • Why not bake an AMI image with your app and data pre configured and launch with that? – Rodrigo Murillo Aug 14 '18 at 01:23
  • AMIs are on S3 so first time I do "conda activate pytorch_p36" it takes several minutes, meanwhile EBS is already pre-warmed – Yaroslav Bulatov Aug 14 '18 at 01:48
  • Amazon EBS-Backed Amis launch in less than 1 minute. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ComponentsAMIs.html#storage-for-the-root-device – Rodrigo Murillo Aug 14 '18 at 02:13
  • Are you launching a stock AMI and then configuring the software at launch time? – Rodrigo Murillo Aug 14 '18 at 02:16
  • But don't EBS-backed AMI's need prewarming? More specifically I noticed that on official DLAMI the following takes 40 seconds the first time I run it `source activate pytorch_p36; python -c "import torch; print('hello')`, the second time it's instantaneous – Yaroslav Bulatov Aug 14 '18 at 03:26
  • I just confirmed that stopping an instance/then starting again, the commands above are instantaneous, so it's issue of prewarming – Yaroslav Bulatov Aug 14 '18 at 03:35
  • You cannot stop a Spot Instance because there is no guarantee you'd be allowed to Start it again later at the given price. – John Rotenstein Aug 14 '18 at 06:20
  • Persistent Spot instances can be stopped and restarted (feature added Sep-2017). However, this might be only for AWS to control your spot instances: https://aws.amazon.com/about-aws/whats-new/2017/09/amazon-ec2-spot-can-now-stop-and-start-your-spot-instances/ – John Hanley Aug 15 '18 at 07:46
  • @John -- hm, there's already no guarantee I can start an instance again for regular instances -- I've had cases when instance wouldn't start due to "out of capacity" on AWS side – Yaroslav Bulatov Aug 16 '18 at 00:50
  • Are you referring to normal EC2 instances? What size? I have heard of this, but in more than 10 years launching instances everyday with AWS, I have never seen this issue. I don't use T2.micro type of instances, usually mid-range to large instances. I consider an instance with 4 GB of memory "small". – John Hanley Aug 16 '18 at 01:09
  • I pretty regularly run into "out of capacity" for p3.16xlarge instances – Yaroslav Bulatov Aug 16 '18 at 01:45

1 Answer


Stopping a spot instance is only possible when AWS itself preempts it. A user cannot request a stop, only a termination. Every launch therefore creates a fresh root volume from the AMI's snapshot in S3. Because blocks are loaded lazily, the first read of each block is slow, so latency varies on every boot.

To avoid the S3 penalty, the EBS volumes must already exist and be attached to the spot instance when it launches.

One solution to get the feel of having a "warm" root volume is to use chroot with the attached volume.

Every EBS-backed AMI references a snapshot ID for its root volume. One or more standalone EBS volumes can be created from that snapshot, and these volumes behave like the root volume of a stopped on-demand instance. If the goal is speed rather than extra isolation, then once the volume is mounted, bind the host's system paths into the chroot location. Something similar to the following will work in most cases:

# Run as root. /mnt/dlami is where the warm volume is mounted; it must match
# the ChrootDirectory used for ssh. Paths are case-sensitive.
mount -o bind /proc /mnt/dlami/proc
mount -o bind /sys /mnt/dlami/sys
mount -o bind /dev /mnt/dlami/dev
mount -o bind /dev/pts /mnt/dlami/dev/pts
mount -o bind /tmp /mnt/dlami/tmp

mount -o bind /run /mnt/dlami/run
mount -o bind /run/lock /mnt/dlami/run/lock
mount -o bind /dev/shm /mnt/dlami/dev/shm
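
The same bind mounts could be wrapped in a small helper so they can be rerun safely on every launch. This is only a sketch: the function name `bind_system_paths` and the `CHROOT_DIR` and `DRY_RUN` variables are illustrative names, and it assumes the volume is mounted at /mnt/dlami, as in the ssh configuration in this answer. With `DRY_RUN=1` (the default here) it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: bind the host's system paths into a chroot directory.
# CHROOT_DIR and DRY_RUN are illustrative names, not part of the original answer.
CHROOT_DIR="${CHROOT_DIR:-/mnt/dlami}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, anything else = run them (as root)

bind_system_paths() {
    # Parents (/dev, /run) are listed before their children (/dev/pts, /run/lock).
    for path in /proc /sys /dev /dev/pts /tmp /run /run/lock /dev/shm; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "mount -o bind $path $CHROOT_DIR$path"
        else
            # Skip paths that are already mount points so reruns are harmless.
            mountpoint -q "$CHROOT_DIR$path" || mount -o bind "$path" "$CHROOT_DIR$path"
        fi
    done
}

# Usage: call bind_system_paths to preview; set DRY_RUN=0 first to apply.
```

Keeping the path list in one place avoids the kind of path mismatch that is easy to introduce when the eight commands are typed by hand.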

Finally, configure sshd (typically in /etc/ssh/sshd_config) to chroot the login user into the mounted path:

Match User ubuntu
    ChrootDirectory /mnt/dlami

Now, when the spot instance launches and the volume is mounted, the user logs in directly onto the attached EBS volume. Its blocks are fetched from S3 only once (as with an on-demand instance) and persist across instance attachments. The spot instance can be terminated, and a new instance can remount the "warm" storage.

You will need to have a system in place to match existing EBS volumes with new spot requests, as well as UserData or API calls that will take care of attaching the volumes and setting up chroot.
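
As a rough illustration of that matching step, the AWS CLI calls involved might look like the sketch below. Everything specific here is an assumption made for the example: the `warm-root` Name tag, the `/dev/sdf` device name, the function names, and the IDs passed in. The `run` wrapper prints commands by default (`DRY_RUN=1`) so the flow can be inspected without touching an AWS account.

```shell
#!/bin/sh
# Sketch only: the tag value, device name, and IDs are placeholder assumptions.
DRY_RUN="${DRY_RUN:-1}"                # 1 = print commands instead of running them
VOLUME_TAG="${VOLUME_TAG:-warm-root}"  # hypothetical Name tag marking warm volumes
DEVICE="${DEVICE:-/dev/sdf}"           # device name requested at attach time

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One-time setup: create a standalone volume from the AMI's root snapshot.
create_warm_volume() {
    snapshot_id="$1"
    zone="$2"
    run aws ec2 create-volume \
        --snapshot-id "$snapshot_id" \
        --availability-zone "$zone" \
        --tag-specifications "ResourceType=volume,Tags=[{Key=Name,Value=$VOLUME_TAG}]"
}

# Per launch: look up an unattached warm volume in the instance's zone.
find_warm_volume() {
    zone="$1"
    run aws ec2 describe-volumes \
        --filters "Name=tag:Name,Values=$VOLUME_TAG" \
                  "Name=status,Values=available" \
                  "Name=availability-zone,Values=$zone" \
        --query "Volumes[0].VolumeId" --output text
}

# Per launch: attach the volume and wait until EC2 reports it in use.
attach_warm_volume() {
    volume_id="$1"
    instance_id="$2"
    run aws ec2 attach-volume --volume-id "$volume_id" \
        --instance-id "$instance_id" --device "$DEVICE"
    run aws ec2 wait volume-in-use --volume-ids "$volume_id"
}
```

The volume has to live in the same availability zone as the new spot instance, which is why both the creation and the lookup take a zone. Once attached, the device may appear under a different name inside the instance (for example as an NVMe device on newer instance types), so the mount step should not hard-code the name requested here.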

At Spotinst, we thought this was an exciting use case and wrote a blog post that goes into more detail: https://blog.spotinst.com/2018/10/09/imagenet-ec2-spot/

Kevin McGrath