9

How would you best handle persistent data between instances with a load-balanced service in Amazon ECS? Data-only containers will not work, and neither will the volumes you can specify in the tasks; both only persist on the instance itself. I have been trying to read up on attaching an EBS volume upon instance creation with User Data in the Launch Configuration, but I had no luck there.

helloV
Sultanen
  • How much data? Is it read only? – Rodrigo Murillo Jan 27 '16 at 00:00
  • I need to store the MySQL database + user-uploaded content. No huge amounts of data, but it needs to be R+W. I use a Linux environment – Sultanen Jan 27 '16 at 00:14
  • Amazon ECS data volumes are what you are looking for: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_data_volumes.html – number5 Jan 27 '16 at 00:32
  • @number5 From that page you can read that data volumes do not sync between instances, which is of little use with autoscaling, since it can delete any instance that is no longer needed: "Amazon ECS does not sync your data volumes across container instances. Tasks that use persistent data volumes can be placed on any container instance in your cluster that has available capacity. If your tasks require persistent data volumes after stopping and restarting, you should always specify the same container instance at task launch time with the AWS CLI start-task command." – Sultanen Jan 27 '16 at 15:56
  • @Sultanen Sorry, I misunderstood your question. What you want is actually persistent storage for a Docker cluster (like Swarm). I would suggest looking at RDS (such as Aurora or MySQL) plus S3 (for user-uploaded content). Also check out Kubernetes (which runs smoothly on normal EC2) – number5 Jan 28 '16 at 05:17

3 Answers

11

You can use Amazon EFS to share a filesystem across ECS containers and instances. EFS is based on NFS, so it can be mounted on multiple host instances at the same time. This allows cluster scheduling and scaling to work as intended. See a tutorial for persisting MySQL data this way here:

https://aws.amazon.com/blogs/compute/using-amazon-efs-to-persist-data-from-amazon-ecs-containers/
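As a rough sketch of the idea (not the tutorial's exact setup): each container instance mounts the EFS filesystem at the same host path at boot, and the task definition maps that host path into the container through a host volume. The filesystem ID, region, paths, image, and the helper names `efs_mount_command` and `task_definition` below are all hypothetical placeholders.

```python
import json


def efs_mount_command(file_system_id: str, region: str, mount_dir: str) -> str:
    """Build a shell command that mounts an EFS filesystem over NFSv4.1.

    Assumes the instance has an NFS client installed and can reach an EFS
    mount target; intended for use in the instance's User Data.
    """
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    return (
        f"mkdir -p {mount_dir} && "
        f"mount -t nfs4 -o nfsvers=4.1 {dns_name}:/ {mount_dir}"
    )


def task_definition(family: str, image: str, host_dir: str, container_dir: str) -> dict:
    """Build an ECS task definition that bind-mounts a host directory.

    Because host_dir lives on the shared EFS mount, the task sees the same
    files no matter which container instance it is scheduled on.
    """
    volume_name = "shared-efs"  # hypothetical volume name
    return {
        "family": family,
        "volumes": [{"name": volume_name, "host": {"sourcePath": host_dir}}],
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "memory": 256,
                "essential": True,
                "mountPoints": [
                    {"sourceVolume": volume_name, "containerPath": container_dir}
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(efs_mount_command("fs-12345678", "eu-west-1", "/mnt/efs"))
    print(json.dumps(
        task_definition("uploads", "nginx", "/mnt/efs/uploads", "/var/uploads"),
        indent=2,
    ))
```

The printed JSON could then be registered as a task definition, and the mount command placed in the Launch Configuration's User Data so that every instance autoscaling adds has the share mounted before tasks start.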

Wouter de Winter
4

I suggest using Amazon EFS (https://aws.amazon.com/blogs/compute/using-amazon-efs-to-persist-data-from-amazon-ecs-containers/).

One limitation to add: at the time of writing, only 4 regions support EFS.

EU (Ireland)

US East (N. Virginia)

US East (Ohio)

US West (Oregon)

If your region is not supported, you can set up your own NFS share to share a persistent folder between EC2 instances. S3FS looks cool, but it was buggy when I tested it 2 years ago (things may have changed since).

iapilgrim
2

Depending on your data needs, you have two options I can think of:

Mapping S3 bucket as a local drive

You can share an S3 bucket and limit access to any number of instances. We use a drive mapping solution in Windows that mounts an S3 bucket as a local drive. Similar drivers exist for Linux. Each instance gets the same mapped drive and shares the same persistent data. The data is read/write, so if we scale in or out, each instance has access to the S3 data in a consistent format.

Mount a volume from a Snapshot

As you suggest, if it is read-only data that you need access to, you can use User Data scripts to mount a volume from a snapshot at launch time. You just need a script and credentials or an IAM role that allows it to run the appropriate commands at launch time.
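A minimal sketch of what such a User Data script might look like, generated here from Python so the snapshot ID can be filled in per environment. The helper name `snapshot_user_data`, the snapshot ID, the device name, and the mount path are hypothetical. It assumes the AWS CLI is installed on the instance, the instance role permits `ec2:CreateVolume` and `ec2:AttachVolume`, and the instance metadata endpoint is reachable without a session token.

```python
def snapshot_user_data(snapshot_id: str,
                       device: str = "/dev/xvdf",
                       mount_dir: str = "/mnt/data") -> str:
    """Render a bash User Data script that creates a volume from a snapshot,
    attaches it to the booting instance, and mounts it read-only.

    Assumptions: the device shows up under the requested name (on some
    instance types it appears as an NVMe device instead), and the region is
    the availability zone name minus its final letter.
    """
    metadata = "http://169.254.169.254/latest/meta-data"
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        f"INSTANCE_ID=$(curl -s {metadata}/instance-id)",
        f"AZ=$(curl -s {metadata}/placement/availability-zone)",
        "REGION=${AZ%?}",
        # Create the volume in the same availability zone as the instance.
        'VOLUME_ID=$(aws ec2 create-volume --region "$REGION" '
        f'--availability-zone "$AZ" --snapshot-id {snapshot_id} '
        "--query VolumeId --output text)",
        'aws ec2 wait volume-available --region "$REGION" --volume-ids "$VOLUME_ID"',
        'aws ec2 attach-volume --region "$REGION" --volume-id "$VOLUME_ID" '
        f'--instance-id "$INSTANCE_ID" --device {device}',
        'aws ec2 wait volume-in-use --region "$REGION" --volume-ids "$VOLUME_ID"',
        # The block device may take a moment to appear after attachment.
        f"while [ ! -e {device} ]; do sleep 1; done",
        f"mkdir -p {mount_dir}",
        f"mount -o ro {device} {mount_dir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder snapshot ID for illustration only.
    print(snapshot_user_data("snap-0123456789abcdef0"))
```

Each instance gets its own copy of the volume this way, so the approach only suits data that does not need to be written back and shared.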

Rodrigo Murillo
  • Thanks for the input! I will try to find an S3 mapper for Linux! – Sultanen Jan 27 '16 at 00:18
  • I added a reference to a Linux S3 mapper. – Rodrigo Murillo Jan 27 '16 at 00:27
  • @Sultanen but absolutely **do not** try to run a MySQL database over an S3 mapper. S3 is an object store, not a filesystem, and lacks the necessary consistency guarantees for this to work properly. – Michael - sqlbot Jan 27 '16 at 01:32
  • @Michael-sqlbot Thanks for the comment, we will try to go for RDS for the MySQL database instead. – Sultanen Jan 27 '16 at 15:57
  • @RodrigoM Thanks for the link, I was looking into s3fs. I will mark this answer as accepted :) – Sultanen Jan 27 '16 at 15:57
  • I should follow up that this is not a limitation in s3fs itself. I use it, but not on the front-end, and not for databases. It's a clever way of doing things but you have to understand that there's an impedance gap between proper filesystems and object stores that cannot be fully bridged. Additionally, MySQL doesn't work over any kind of shared volumes. For other things, though, once it is available in your regions, Elastic File System is quite nice. – Michael - sqlbot Jan 27 '16 at 17:40
  • I feel like it's kind of strange that Amazon doesn't have a more standardised way of getting persistent data between instances? EFS is maybe the answer, though? – Sultanen Jan 29 '16 at 08:26
  • Amazon EFS is probably the answer moving forward: https://aws.amazon.com/blogs/compute/using-amazon-efs-to-persist-data-from-amazon-ecs-containers/ – Yong Jie Wong Feb 28 '16 at 13:39